Study on impact of changing the nature of data on the overall file compression ratio

  • Muhammad Amin Zayyat, University of Aleppo
  • Mohammad Samir Modabbes
Keywords: Compression Ratio, Lossless Compression, Symbol replacement


This paper studies a pre-compression method that changes the nature of the data by replacing symbols, so that symbol frequencies increase before one of the known compression algorithms is applied. The method replaces the most frequent symbol and the next most frequent symbol with a single symbol that carries the sum of their occurrences, then replaces the third and fourth most frequent symbols with another symbol that carries the sum of their occurrences, and so on until all symbols have been replaced. The number of new symbols is therefore half the number of original ones. The study considers the 8-bit (extended) form of the ASCII coding system (American Standard Code for Information Interchange), which provides 256 characters of eight bits each, so each new symbol fits in seven bits. To ensure that no data is lost, each symbol is accompanied by one distinguishing bit written to a separate file; this bit makes it possible to restore the original symbol. Finally, the two files are merged and the seven-bit symbols are compressed using the LZW compression algorithm, so that the final file contains symbols that occur more frequently than those of the original file. Decompression proceeds in reverse: the file is first decompressed using the LZW algorithm, the two files are then separated, and each seven-bit symbol read from the first file is combined with one bit from the second file to restore the original symbol, until all the original ASCII symbols are restored.
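The symbol-replacement stage described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`build_ranking`, `split_symbols`, `restore_symbols`), the tie-breaking rule, and the assumption that the frequency ranking is available to the decoder are all hypothetical choices made for illustration. The merge of the two streams and the LZW stage are omitted.

```python
from collections import Counter

def build_ranking(data: bytes) -> list[int]:
    """Order all 256 byte values from most to least frequent.

    Ties are broken by byte value (an assumption; the abstract does not
    say how ties are handled).
    """
    counts = Counter(data)
    return sorted(range(256), key=lambda b: (-counts[b], b))

def split_symbols(data: bytes, ranking: list[int]) -> tuple[list[int], list[int]]:
    """Map each byte to a 7-bit pair code and a 1-bit flag.

    Symbols ranked 0 and 1 share code 0, ranks 2 and 3 share code 1, and
    so on, giving 128 codes. The flag records which member of the pair
    the original byte was.
    """
    rank_of = {b: r for r, b in enumerate(ranking)}
    codes = [rank_of[b] // 2 for b in data]
    flags = [rank_of[b] % 2 for b in data]
    return codes, flags

def restore_symbols(codes: list[int], flags: list[int], ranking: list[int]) -> bytes:
    """Invert split_symbols: recombine each code with its flag bit."""
    return bytes(ranking[2 * c + f] for c, f in zip(codes, flags))

sample = b"abracadabra"
ranking = build_ranking(sample)
codes, flags = split_symbols(sample, ranking)
restored = restore_symbols(codes, flags, ranking)
```

In this sketch, the count of each pair code equals the sum of the counts of its two original symbols, which is the frequency increase the abstract refers to. The one flag bit per symbol is the cost of that increase, and the flags must be stored alongside the codes for the transformation to stay lossless.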

How to Cite
Zayyat, Muhammad Amin, & Modabbes, M. (2022). Study on impact of changing the nature of data on the overall file compression ratio. Association of Arab Universities Journal of Engineering Sciences, 29(1), 56-63.