mollensoft

Hello All,
I am using the example code below to compress and decompress files, but when I compress image files (.JPG) and some other file types (.PDF), the compressed output is larger than the original input file.

Can anyone help me understand why this is occurring, and how to avoid creating larger files when using VB-Xpress[2005]? Thanks in advance :>)

--snip--
Imports System
Imports System.IO
Imports System.IO.Compression

Public Class GZipTest
    Shared msg As String

    Public Shared Function ReadAllBytesFromStream(ByVal stream As Stream, ByVal buffer() As Byte) As Integer
        ' Read from the stream until it is exhausted or the buffer is full,
        ' and return the total number of bytes read.
        Dim offset As Integer = 0
        Dim totalCount As Integer = 0
        While offset < buffer.Length
            ' Never request more bytes than the buffer still has room for.
            Dim bytesRead As Integer = stream.Read(buffer, offset, Math.Min(100, buffer.Length - offset))
            If bytesRead = 0 Then
                Exit While
            End If
            offset += bytesRead
            totalCount += bytesRead
        End While
        Return totalCount
    End Function 'ReadAllBytesFromStream


    Public Shared Function CompareData(ByVal buf1() As Byte, ByVal len1 As Integer, ByVal buf2() As Byte, ByVal len2 As Integer) As Boolean
        ' Use this method to compare data from two different buffers.
        If len1 <> len2 Then
            msg = "Number of bytes in the two buffers is different: " & len1 & ":" & len2
            MsgBox(msg)
            Return False
        End If

        Dim i As Integer
        For i = 0 To len1 - 1
            If buf1(i) <> buf2(i) Then
                msg = "byte " & i & " is different " & buf1(i) & "|" & buf2(i)
                MsgBox(msg)
                Return False
            End If
        Next i
        msg = "All bytes compare."
        MsgBox(msg)
        Return True
    End Function 'CompareData


    Public Shared Sub GZipCompressDecompress(ByVal filename As String)
        msg = "Test compression and decompression on file " & filename
        MsgBox(msg)

        Dim infile As FileStream
        Try
            ' Open the file as a FileStream object.
            infile = New FileStream(filename, FileMode.Open, FileAccess.Read, FileShare.Read)
            ' Size the buffer to hold the whole file (Length is a Long, so convert it).
            Dim buffer(CInt(infile.Length) - 1) As Byte
            ' Read the file to ensure it is readable.
            Dim count As Integer = infile.Read(buffer, 0, buffer.Length)
            If count <> buffer.Length Then
                infile.Close()
                msg = "Test Failed: Unable to read data from file"
                MsgBox(msg)
                Return
            End If
            infile.Close()
            Dim ms As New MemoryStream()
            ' Use the newly created memory stream for the compressed data.
            ' Passing True (leaveOpen) keeps ms usable after the GZipStream is closed.
            Dim compressedzipStream As New GZipStream(ms, CompressionMode.Compress, True)
            compressedzipStream.Write(buffer, 0, buffer.Length)
            ' Close the GZipStream so the remaining compressed data is flushed to ms.
            compressedzipStream.Close()

            msg = "Original size: " & buffer.Length & ", Compressed size: " & ms.Length
            MsgBox(msg)

            ' Reset the memory stream position to begin decompression.
            ms.Position = 0
            Dim zipStream As New GZipStream(ms, CompressionMode.Decompress)
            ' Leave some slack so an unexpectedly long result shows up as a length mismatch.
            Dim decompressedBuffer(buffer.Length + 100) As Byte
            ' Use the ReadAllBytesFromStream to read the stream.
            Dim totalCount As Integer = GZipTest.ReadAllBytesFromStream(zipStream, decompressedBuffer)
            msg = "Decompressed " & totalCount & " bytes"
            MsgBox(msg)

            If Not GZipTest.CompareData(buffer, buffer.Length, decompressedBuffer, totalCount) Then
                msg = "Error. The two buffers did not compare."
                MsgBox(msg)
            End If
            zipStream.Close()
        Catch e As Exception
            ' Report the actual failure; not every exception means invalid data.
            msg = "Error: " & e.Message
            MsgBox(msg)
        End Try

    End Sub 'GZipCompressDecompress

    Public Shared Sub Main(ByVal args() As String)
        Dim usageText As String = "Usage: GZIPTEST <inputfilename>"
        ' If no file name is specified, write usage text.
        If args.Length = 0 Then
            Console.WriteLine(usageText)
        Else
            If File.Exists(args(0)) Then
                GZipCompressDecompress(args(0))
            End If
        End If
    End Sub 'Main
End Class

--EndOfSnip--



Re: Visual Basic Express Edition Gzip Compression Creates Larger-than-original When Compressing some types of files

JohnWein

When you try to compress a file that is already compressed, the resultant file will often be slightly larger than the original file. JPG files are already compressed, and PDF files usually store most of their content in compressed form. Compression is based on the fact that most files contain redundant, structured patterns. After compression those patterns are gone and the data looks nearly random, so a second pass has nothing left to remove. If you try to compress a file generated by a random number generator, you'll find that it can't be compressed. Any compression algorithm also adds some overhead bytes to the file, which is why the output can end up larger than the input. If you want to avoid making files that are larger than the original, just compare the original and compressed files and keep the original if it is smaller. Usually though, it doesn't really matter because the two files will be close to the same size.

Edit: I meant to say

Usually though, it doesn't really matter because if the file isn't compressible, the two files will be close to the same size.
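
The "keep the original if it is smaller" suggestion can be sketched roughly as follows. This is only an illustration, not code from the thread: the module name KeepSmallerSketch and the helper CompressIfSmaller are hypothetical, and the sketch works on byte arrays in memory rather than on files. The usage part feeds it one highly redundant buffer and one pseudo-random buffer to show the difference described above.

```vb
Imports System
Imports System.IO
Imports System.IO.Compression

Public Module KeepSmallerSketch
    ' Hypothetical helper (not part of the original post): returns the
    ' gzip-compressed bytes only when they are smaller than the input,
    ' otherwise returns the input unchanged. wasCompressed reports which
    ' form was returned.
    Public Function CompressIfSmaller(ByVal data() As Byte, ByRef wasCompressed As Boolean) As Byte()
        Dim ms As New MemoryStream()
        Dim gz As New GZipStream(ms, CompressionMode.Compress, True)
        gz.Write(data, 0, data.Length)
        ' Closing the GZipStream flushes the remaining compressed data to ms.
        gz.Close()

        Dim compressed() As Byte = ms.ToArray()
        wasCompressed = compressed.Length < data.Length
        If wasCompressed Then
            Return compressed
        End If
        Return data
    End Function

    Public Sub Main()
        Dim flag As Boolean

        ' Highly redundant input: 100,000 zero bytes.
        Dim redundant(99999) As Byte
        Dim r1() As Byte = CompressIfSmaller(redundant, flag)
        Console.WriteLine("Redundant: " & redundant.Length & " -> " & r1.Length & ", compressed = " & flag)

        ' Pseudo-random input: no patterns for the compressor to exploit.
        Dim noise(99999) As Byte
        Dim rng As New Random(12345)
        rng.NextBytes(noise)
        Dim r2() As Byte = CompressIfSmaller(noise, flag)
        Console.WriteLine("Random: " & noise.Length & " -> " & r2.Length & ", compressed = " & flag)
    End Sub
End Module
```

The caller has to record the wasCompressed flag somewhere (for example in the file extension), because otherwise there is no reliable way to know later whether the stored bytes need to be decompressed.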






nobugz

Also watch out for small files. Your code worked fine on compressible large files.
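
The small-file caveat comes from the gzip container itself: the gzip format (RFC 1952) puts a 10-byte header in front of the compressed data and an 8-byte trailer (CRC-32 plus original length) behind it, so a very small input cannot come out smaller than it went in. A minimal sketch to see this, assuming a hypothetical helper named GZipSize that is not part of the original code:

```vb
Imports System
Imports System.IO
Imports System.IO.Compression
Imports System.Text

Public Module SmallInputSketch
    ' Hypothetical helper: returns the number of bytes produced by
    ' gzip-compressing the given data.
    Public Function GZipSize(ByVal data() As Byte) As Long
        Dim ms As New MemoryStream()
        Dim gz As New GZipStream(ms, CompressionMode.Compress, True)
        gz.Write(data, 0, data.Length)
        ' Close to flush the compressed data and the trailer into ms.
        gz.Close()
        Return ms.Length
    End Function

    Public Sub Main()
        ' Five bytes of input; the fixed gzip overhead alone is larger than that.
        Dim tiny() As Byte = Encoding.ASCII.GetBytes("hello")
        Console.WriteLine("Input: " & tiny.Length & " bytes, gzip output: " & GZipSize(tiny) & " bytes")
    End Sub
End Module
```

For inputs this small the fixed overhead dominates, so a size check before or after compressing is the simplest guard.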






mollensoft

Hi John,
Thank you for the explanation. It makes good sense. I never thought of compression in that way (reducing the redundant sequences), but it does seem quite logical.

Thanks again!

-Al Mollenkopf







mollensoft

Thank you, Hans. I see what you mean. I'll continue to tinker.

-Al Mollenkopf