Ahmad, Nor Azhar (2010) An enhanced LZ77 algorithm with hash table to compress large scale DNA sequence. Masters thesis, Universiti Teknologi Malaysia, Faculty of Computer Science and Information Systems.
Compression techniques have recently seen encouraging uptake across many fields of data management. DNA data sets have grown large, which creates problems for both storage and data transfer. The common approach is to place the data on a server, which adds to the cost of data management. Moreover, online transfer is no longer the best solution: for a research center with a low-speed Internet connection, the transfer is almost impossible to carry out. This study proposes an enhancement of the LZ77 algorithm, a common non-greedy, dictionary-type algorithm that uses the sliding window concept for alphabetical data compression. By introducing sectioned sliding windows with a hash table approach, the proposed compression algorithm addresses the storage problem of large DNA sequences. The implementation reduces processing time and improves data compression rates. Two formats of DNA data (binary and FASTA) were tested and analysed. Simulation showed promising compression rates that scale with the size of the DNA data, reaching a rate of 56% per bit. Compared with BioCompress, an LZ77-based DNA compression algorithm with a compression rate of 44%, the proposed algorithm performs better by 12 percentage points. The implication of this study is a reduction in the cost of handling large-scale DNA data.
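The general idea of pairing an LZ77 sliding window with a hash table can be illustrated with a minimal sketch. This is not the thesis's algorithm: the sectioning scheme is not described in the abstract and is not reproduced here. The names `compress` and `decompress`, the constants `MIN_MATCH`, `WINDOW` and `MAX_MATCH`, and the token layout are all assumptions chosen for illustration.

```python
from collections import defaultdict

# All three values are illustrative assumptions, not taken from the thesis.
MIN_MATCH = 4     # length of the substring used as the hash key
WINDOW = 4096     # how far back a match may reach
MAX_MATCH = 255   # cap on the length of a single match


def compress(seq):
    """Encode seq as tokens: ('lit', char) or ('ref', distance, length)."""
    table = defaultdict(list)  # key -> start positions (grows unbounded in this sketch)
    tokens = []
    n = len(seq)
    i = 0
    while i < n:
        best_len, best_dist = 0, 0
        key = seq[i:i + MIN_MATCH]
        if len(key) == MIN_MATCH:
            # Positions are stored in increasing order, so scan newest first
            # and stop as soon as a candidate falls outside the window.
            for j in reversed(table[key]):
                if i - j > WINDOW:
                    break
                length = MIN_MATCH
                while (length < MAX_MATCH and i + length < n
                       and seq[j + length] == seq[i + length]):
                    length += 1
                if length > best_len:
                    best_len, best_dist = length, i - j
        if best_len >= MIN_MATCH:
            tokens.append(('ref', best_dist, best_len))
            step = best_len
        else:
            tokens.append(('lit', seq[i]))
            step = 1
        # Record every consumed position so later matches can find it.
        for p in range(i, i + step):
            k = seq[p:p + MIN_MATCH]
            if len(k) == MIN_MATCH:
                table[k].append(p)
        i += step
    return tokens


def decompress(tokens):
    """Rebuild the original string from the token list."""
    out = []
    for token in tokens:
        if token[0] == 'lit':
            out.append(token[1])
        else:
            _, dist, length = token
            start = len(out) - dist
            # Copy one symbol at a time so overlapping matches work.
            for k in range(length):
                out.append(out[start + k])
    return ''.join(out)
```

The hash table replaces a linear scan of the window: only positions that share the first `MIN_MATCH` symbols with the current position are examined. A real encoder would also pack the tokens into bits and prune stale table entries, both of which are omitted here.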
|Item Type:||Thesis (Masters)|
|Additional Information:||Thesis (Sarjana Sains (Sains Komputer)) - Universiti Teknologi Malaysia, 2010; Supervisor : Assoc. Prof. Abd. Manan Ahmad|
|Uncontrolled Keywords:||data compression (Computer science), DNA data|
|Subjects:||Q Science > Q Science (General); Q Science > QA Mathematics > QA76 Computer software|
|Divisions:||Computer Science and Information System|
|Deposited By:||Ramli Haron|
|Deposited On:||25 Jan 2012 00:37|
|Last Modified:||25 Jan 2012 00:37|