How are scraps actually hashed?

jwh · March 4, 2024, 1:11am

Hey there!

Apologies if this has been answered already, I searched but couldn’t find anything.

How, specifically, does Scrapscript actually hash each scrap?
Which hashing algorithm is used?
What canonicalization (if any) takes place before computing the hash?

This is a fascinating project, I’m excited to see how it evolves. I found your homepage whilst implementing my own interpretation of a content-addressing solution for lambda functions. Scrapscript looks awesome and I’d use it in production today if I could!

Jack

surprisetalk · March 12, 2024, 4:23pm

Hi Jack,

Right now, you can specify which hash you are using with the following syntax:

$sha1'3efce6ae1ebf7fef7c7bdd8c270d76da5b079438

Note that SHA1 is somewhat problematic, which is why I used it as an example.
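For readers who want to experiment with that syntax, here is a minimal Python sketch of splitting such a reference into its algorithm name and digest. The grammar is an assumption inferred from the single example above (`$`, an algorithm name, `'`, then lowercase hex), and `parse_hash_ref` is a hypothetical helper, not part of Scrapscript.

```python
import re

# Assumed shape: "$" + algorithm name + "'" + lowercase hex digest.
# Only the one example from this thread is known; the real grammar may differ.
HASH_REF = re.compile(r"^\$([a-z0-9]+)'([0-9a-f]+)$")

def parse_hash_ref(text):
    """Split a reference like $sha1'3efc... into (algorithm, hex_digest)."""
    match = HASH_REF.match(text)
    if match is None:
        raise ValueError(f"not a hash reference: {text!r}")
    return match.group(1), match.group(2)

algo, digest = parse_hash_ref("$sha1'3efce6ae1ebf7fef7c7bdd8c270d76da5b079438")
print(algo, len(digest))
```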

Some folks (I forgot where) made very good points about the hash type declaration not really being that important. In theory, the scrapyard could hash the content using CRC32, SHA1, MD5, etc. and try to match against all at the same time in the same big KV store. The address space is really, really big!
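A rough Python sketch of that idea, assuming the scrapyard is just an in-memory dictionary: the same flat bytes are indexed under several digests at once, and a lookup needs only the digest, not the algorithm name. The `Scrapyard` class and `digests` function are hypothetical names for illustration. Collisions are ignored here, which is unrealistic for CRC32 since it has only 32 bits.

```python
import hashlib
import zlib

def digests(flat):
    """Hex digests of the same bytes under several algorithms."""
    return {
        "crc32": format(zlib.crc32(flat), "08x"),
        "md5": hashlib.md5(flat).hexdigest(),
        "sha1": hashlib.sha1(flat).hexdigest(),
    }

class Scrapyard:
    """Hypothetical store: one key-value map shared by every hash algorithm."""

    def __init__(self):
        self._store = {}

    def put(self, flat):
        keys = digests(flat)
        for digest in keys.values():
            # No collision handling; a real store would have to detect
            # two different scraps landing on the same key.
            self._store[digest] = flat
        return keys

    def get(self, digest):
        # The caller does not need to say which algorithm produced the digest.
        return self._store.get(digest)

yard = Scrapyard()
keys = yard.put(b"example flat scrap bytes")
found = yard.get(keys["sha1"])
```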

The canonical format of a scrap is its “flat scrap”, which is a binary representation of a scrap (something like msgpack). I’m still working through the details on this, but I expect to canonicalize variable names in functions, so b -> b and a -> a should flatten to exactly the same content. If possible, I’d like to keep “hydrated” variable names elsewhere, so that you can restore the original variable names and other metadata in an editor. I won’t apply any other normalizations, so () -> 1 + 2 and () -> 2 + 1 will remain different.
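To make the variable-name canonicalization concrete, here is a small Python sketch. It is not the flat scrap format, which is still being designed. It assumes a toy AST of tuples, replaces bound variable names with their distance to the enclosing binder (de Bruijn indices, a swapped-in standard technique), and uses `repr` plus SHA-1 as a stand-in for the binary encoding and hash. All names here (`canonicalize`, `scrap_hash`, the node tags) are hypothetical.

```python
import hashlib

def canonicalize(node, env=()):
    """Replace bound variable names with binder distances; keep the rest as-is."""
    kind = node[0]
    if kind == "int":
        return node
    if kind == "var":
        name = node[1]
        # Search the innermost binder first so shadowing resolves correctly.
        for distance, bound in enumerate(reversed(env)):
            if bound == name:
                return ("bound", distance)
        return ("var", name)  # free variable: its name is kept
    if kind == "lam":
        _, param, body = node
        # The parameter name is dropped from the canonical form.
        return ("lam", canonicalize(body, env + (param,)))
    if kind == "add":
        # Operand order is preserved: no algebraic rewriting.
        return ("add", canonicalize(node[1], env), canonicalize(node[2], env))
    raise ValueError(f"unknown node kind: {kind!r}")

def scrap_hash(node):
    # repr() stands in for the real flat scrap encoding (an assumption).
    flat = repr(canonicalize(node)).encode("utf-8")
    return hashlib.sha1(flat).hexdigest()

id_a = ("lam", "a", ("var", "a"))  # a -> a
id_b = ("lam", "b", ("var", "b"))  # b -> b
# () -> 1 + 2 and () -> 2 + 1, with the unit pattern modeled as an unused name.
one_plus_two = ("lam", "_", ("add", ("int", 1), ("int", 2)))
two_plus_one = ("lam", "_", ("add", ("int", 2), ("int", 1)))

same = scrap_hash(id_a) == scrap_hash(id_b)
different = scrap_hash(one_plus_two) != scrap_hash(two_plus_one)
```

Under these assumptions `same` and `different` both come out true: renaming a parameter does not change the hash, while swapping the operands of `+` does. Restoring the original names would need a separate side table keyed by the hash, which this sketch leaves out.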
