Another attempt at file compression :S

SoraK05

An idea came to me about a file having what I would call a 'magic number'.

Let's take these 5 bytes:
10 30 54 62 81

To calculate the magic number, you add the first byte, subtract the second, add the third, subtract the fourth, and so on.
It keeps adding/subtracting until it gets to the last byte, and the result is the magic number.

+10 - 30 + 54 - 62 + 81 = 53
53 is the magic number for those 5 bytes.
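
As a minimal sketch of the calculation just described (the function name `magic_number` is made up for illustration, and the bytes are taken as decimal values as written above):

```python
def magic_number(data: bytes) -> int:
    """Alternating sum: add the 1st byte, subtract the 2nd, add the 3rd, and so on."""
    total = 0
    for index, value in enumerate(data):
        # index 0, 2, 4, ... are added; index 1, 3, 5, ... are subtracted
        total += value if index % 2 == 0 else -value
    return total


example = bytes([10, 30, 54, 62, 81])
print(magic_number(example))  # 53
```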

I believe no combination of those 5 bytes except that one will produce that magic number.
[EDIT was wrong about that.. :)]

To take this idea further, perhaps the count of each byte value can be recorded:
i.e. 10 x 00, 12 x 01, 5 x 02 and 4 x 03 could look like this:

10 12 05 04

These 4 count bytes describe the contents (but not the order) of the 31 bytes here.
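
A rough sketch of building such a count table, assuming the table simply lists the count for byte value 00, then 01, and so on up to the highest value present (`count_table` is a hypothetical helper, not something from an existing tool):

```python
from collections import Counter


def count_table(data: bytes) -> list[int]:
    """Return how many times each byte value from 0 up to max(data) occurs."""
    counts = Counter(data)
    return [counts[value] for value in range(max(data) + 1)]


# 10 x 00, 12 x 01, 5 x 02, 4 x 03; the order inside the file is not recorded
data = bytes([0] * 10 + [1] * 12 + [2] * 5 + [3] * 4)
table = count_table(data)
print(table, sum(table))  # [10, 12, 5, 4] 31
```

The table says which bytes are in the file, not where they go, which is why the ordering still has to be found some other way.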

The idea is to shuffle those known bytes, calculating the magic number of each ordering, until one matches the saved magic number.

I may be wrong, but I believe the likelihood of the magic number repeating among those orderings is low.

On top of that, a hash of the file can be saved, so when an ordering matches the magic number, its contents are verified against the hash.


I presume the likelihood of getting a matching magic number is generally low, and the likelihood of an ordering matching both the magic number and the hash is even lower.

It may take a really long time though :S
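
The search described above could be sketched like this for the 5-byte example. SHA-1 from Python's `hashlib` stands in for "a hash", and `recover` is a name made up here; this is only an illustration under those assumptions:

```python
import hashlib
from itertools import permutations


def magic_number(data) -> int:
    """Alternating sum: add, subtract, add, subtract, ..."""
    return sum(v if i % 2 == 0 else -v for i, v in enumerate(data))


def recover(values, target_magic, target_sha1):
    """Try every distinct ordering; keep those matching the magic number AND the hash."""
    recovered = []
    for candidate in set(permutations(values)):
        if magic_number(candidate) != target_magic:
            continue  # cheap check first
        if hashlib.sha1(bytes(candidate)).digest() == target_sha1:
            recovered.append(bytes(candidate))
    return recovered


original = bytes([10, 30, 54, 62, 81])
found = recover(sorted(original), magic_number(original),
                hashlib.sha1(original).digest())
print(found == [original])  # True
```

With 5 distinct bytes there are only 120 orderings to try. The count grows factorially with file size, which is where the "really long time" comes from.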



EDIT
Just realized that at least 11 other orderings of those 5 bytes give the same magic number :S
10 - 30 + 81 - 62 + 54 = 53
10 - 62 + 54 - 30 + 81 = 53
10 - 62 + 81 - 30 + 54 = 53
54 - 30 + 10 - 62 + 81 = 53
54 - 30 + 81 - 62 + 10 = 53
54 - 62 + 10 - 30 + 81 = 53
54 - 62 + 81 - 30 + 10 = 53
81 - 30 + 10 - 62 + 54 = 53
81 - 30 + 54 - 62 + 10 = 53
81 - 62 + 10 - 30 + 54 = 53
81 - 62 + 54 - 30 + 10 = 53

It may not be as few as I thought.. :S
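
A quick brute-force check of that list. The magic number only depends on which values land in the "add" positions and which in the "subtract" positions, so shuffling 10, 54, 81 among themselves and 30, 62 among themselves changes nothing, which gives 3! x 2! = 12 orderings:

```python
from itertools import permutations


def magic_number(data) -> int:
    """Alternating sum: add, subtract, add, subtract, ..."""
    return sum(v if i % 2 == 0 else -v for i, v in enumerate(data))


values = (10, 30, 54, 62, 81)
matches = [p for p in permutations(values) if magic_number(p) == 53]
print(len(matches))  # 12: the original ordering plus the 11 listed above
```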

Trying to confirm whether the number of orderings that match both the magic number and the hash is low enough that the position of the right one can be recorded as its own 'combination value', and whether the result would still be smaller than the original file..


EDIT
A friend of mine did a calculation showing that every file of 81 bytes or more has more than one arrangement matching both the magic number and the SHA-1 hash, and that the number of matches grows exponentially with every byte after that.
So it seems the magic number + SHA-1 combination (the SHA-1 hash alone being 20 bytes) may or may not have a collision for files of up to 80 bytes.
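
The counting behind this kind of argument can be sketched as follows. This does not reproduce my friend's exact 81-byte figure; it only compares the number of possible orderings for a given count table with the number of distinct 160-bit SHA-1 values (the magic number would add a few more bits to `check_bits`). The function names are made up:

```python
from math import factorial

SHA1_BITS = 160  # a SHA-1 digest is 20 bytes


def arrangements(counts) -> int:
    """Distinct orderings of a file with these byte counts (multinomial coefficient)."""
    total = factorial(sum(counts))
    for count in counts:
        total //= factorial(count)
    return total


def collision_guaranteed(counts, check_bits=SHA1_BITS) -> bool:
    """True when orderings outnumber check values, so two orderings must share one."""
    return arrangements(counts) > 2 ** check_bits


print(arrangements([1, 1, 1, 1, 1]))         # 120 orderings of 5 distinct bytes
print(collision_guaranteed([10, 12, 5, 4]))  # False for the 31-byte example
print(collision_guaranteed([1] * 256))       # True: 256! orderings is far more than 2**160
```

When the function returns False a collision is not forced, but it can still happen; that is the "may or may not" case.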
 

FAST6191

What you described is a minor tweak on a byte sum, a.k.a. the second least advanced method of making a hash/checksum (the simplest being parity).

It is slightly better than your last idea of brute forcing against a cryptographic grade hash. By definition such a hash should not be usable for searching or data reconstruction, owing to the cascade (avalanche) effect; all it can do is check that you got it right, and as was mentioned in your previous thread even that is not ideal. Still, you seem to be heading towards, or slowly reinventing, the idea of data recovery parity (not to be confused with the basic concept of parity) as seen in the higher levels of RAID, PAR2 files and, to a lesser extent, ECC RAM.
Here each part of the file is sampled more often than is strictly necessary to recreate it (oversampled): you can define a straight line with two points, but take four and you still have the line if one of them is wrong. This is usually done via something like a Fourier transform or, in PAR2's case, Reed-Solomon coding, and then you can stick your processing power on recreating the file.
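
As a very small illustration of recovery parity, here is a single XOR parity block in the style of RAID 5. This is a simpler stand-in than the Reed-Solomon coding PAR2 uses, and `xor_blocks` is just a name chosen for the sketch:

```python
from functools import reduce


def xor_blocks(blocks) -> bytes:
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data_blocks = [b"\x10\x30", b"\x54\x62", b"\x81\x00"]
parity = xor_blocks(data_blocks)  # stored alongside the data

# Pretend the second block was lost: XOR the survivors with the parity block.
rebuilt = xor_blocks([data_blocks[0], data_blocks[2], parity])
print(rebuilt == data_blocks[1])  # True
```

Note that the parity block is extra data stored on top of the file. It buys recovery from a known loss, not a smaller file.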

I am pretty much echoing the last thread though so I will stop there.
 

Blood Fetish

I appreciate your enthusiasm for compression algorithms, but you cannot losslessly compress random data. The only way to achieve lossless compression is by exploiting patterns (redundancy) in the data, and random data by definition has none to exploit.
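
A quick way to see this for yourself, using Python's standard `zlib` module as the example compressor (the seed and sizes are arbitrary choices for the sketch):

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable
random_data = bytes(rng.getrandbits(8) for _ in range(4096))
patterned_data = b"10 30 54 62 81 " * 273  # 4095 bytes of pure repetition

# Prints original size and compressed size for each input.
print(len(random_data), len(zlib.compress(random_data, 9)))
print(len(patterned_data), len(zlib.compress(patterned_data, 9)))
```

The repetitive input shrinks to a small fraction of its size, while the random input comes out no smaller than it went in.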

edit: There is a reason why hashing is often referred to as "one-way encryption." It is not possible to derive with certainty the original message upon which the hash was based.
 

SoraK05

I've been trying something.

Let's use a 1 byte hash and 1 byte magic number.
Let's say:

Hash = FF
Magic number = FF (+ or - 255, it doesn't matter whether it is + or -, as long as it is 255).

I've been trying to go through the 'sequence' of this combination by hand.
So far I've got:
1. FF
2. FF 00
3. 01 00 FE

I believe that is the 'sequence' so far..

I kinda stopped there because it will take a while to figure out all the 3 byte matches.

I may be wrong, but matches like this seem to be rare..

Unless I'm mistaken, '3' in the sequence == 01 00 FE

3 bytes so far
FF FF 03
That would be hash FF magic FF and sequence 03

I know this still takes 3 bytes to represent 3 bytes, but it may take me a while to figure out how many combinations of 3 bytes and more match both the hash and the magic number, and so how far the 'sequence' number runs. So far matches appear to be rare.

So far the magic number rarely matches at the same time as the hash, at least up to 2 bytes.

Perhaps by the time the sequence number reaches FF it represents a file of 4 bytes or more (at least I'm hoping), potentially breaking even.

I'll try to figure out how many matches there are with 3 bytes.
I'm also taking into account that this only works for files whose first byte is 01 or higher.
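
Rather than going through the sequence by hand, it could be enumerated with something like the sketch below. Big ASSUMPTION: the 1-byte hash is not specified above, so the byte sum modulo 256 is used here purely because it is consistent with the three entries listed (FF, FF 00, 01 00 FE). `assumed_hash` and `matches` are made-up names:

```python
from itertools import product


def magic_number(data) -> int:
    """Alternating sum: add, subtract, add, subtract, ..."""
    return sum(v if i % 2 == 0 else -v for i, v in enumerate(data))


def assumed_hash(data) -> int:
    """ASSUMPTION: byte sum modulo 256; the real 1-byte hash is unspecified."""
    return sum(data) % 256


def matches(length, target_hash=0xFF, target_magic=0xFF):
    """Files of `length` bytes, first byte 01 or higher, matching hash and +/- magic."""
    found = []
    for candidate in product(range(256), repeat=length):
        if candidate[0] == 0:
            continue  # only files starting with 01 or more
        if (assumed_hash(candidate) == target_hash
                and abs(magic_number(candidate)) == target_magic):
            found.append(bytes(candidate))
    return found


for length in (1, 2):
    print(length, [m.hex(" ") for m in matches(length)])
# 1 ['ff']
# 2 ['ff 00']
```

Length 3 works the same way but means checking roughly 16.7 million candidates, so it is left out of the loop here.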
 
