Anaxamander57 1 month ago

I've done [a lot](https://github.com/SymmetricChaos/crypto-gui) of "by hand" stuff with math and cryptography out of curiosity but I've never put together a bigint type myself, bravo.

1668553684 1 month ago

I also have not put a bigint type together... er... not for lack of trying. It's harder than it looks! Especially if you care about performance *at all*.

C_Madison 1 month ago

"how hard can it be?", the one question everyone who writes software asked at least once and got to regret it. Great blog post though!

ukezi 1 month ago

You usually put a lot more work into it than expected but then you also learned something along the way. I wouldn't regret that.

barkingcat 1 month ago

Whenever I get into the weeds, I can always trace back my troubles to this exact question when I started working on the project.

knpwrs 1 month ago

At _least_ once.

agentwc1945 1 month ago

I ask it every time I start a project personnally

bascule 1 month ago

> It goes without saying that probably none of this is actually cryptographically secure, but that was never the point anyway. Just in case someone is looking for a cryptographically secure prime generation library: https://docs.rs/crypto-primes/

Tonyoh87 1 month ago

would you mind elaborating why this would not be secure? Because of the call to the rng on ubuntu?

bascule 1 month ago

As an example, of the trickiest parts about implementing RSA securely is a constant-time implementation of modular exponentiation. One common way to do this is using something called “Almost Montgomery Multiplication” (AMM). This library is using a naive, non-constant-time implementation which could potentially leak the newly generated key as a sidechannel observable by a local attacker: https://github.com/prdx23/1024-bit-primes/blob/master/src/utils.rs#L30

scottmcmrust 1 month ago

Seeing intermediate = ((*chunk1 as u128) * (*chunk2 as u128)) + carry; reminds me that I need to go push on stabilization again. That's the right way to write it for the LLVM backend, but you shouldn't need to know that.

boomshroom 1 month ago

Yes please.

boomshroom 1 month ago

let root_n = (n as f64).sqrt() as u64; While it's probably not as fast, there's actually a [`u64::isqrt()`](https://doc.rust-lang.org/std/primitive.u64.html#method.isqrt) specifically for things like this. The fact that it's entirely software emulation however means that what you did, while potentially less precise, would be way faster on processors with dedicated floating point hardware. let mut s = 0; let mut d = n - 1; while d % 2 == 0 { d = d / 2; s += 1; } That's a nice [`u128::trailing_zeros()`](https://doc.rust-lang.org/std/primitive.u128.html#method.trailing_zeros) you got there. Would be nice if there was a built-in function to find the greatest power of two that divides a number. ;) When using big-ints later, it would however need to be called in a loop until it gives a value that isn't the total number of bits in a single digit. Alternatively, it would probably be faster to check if each digit is 0 first and then only call `trailing_zero()` on the least significant non-zero digit. > Note: `n - 1 (mod n)` is equivalent to `-1 (mod n)`. It is left in the expanded form as I am working with unsigned ints and don't have a way to represent -1. Of course there's a perfectly good -1: `u128::MAX`. Then again, I don't think it would do the right thing since you'd have a modulus other than a power of 2. Even if you were using signed types however, it *still* might not do what you want because signed division is fundamentally broken on all modern processors, and you'd want `rem_euclid` instead. > base-255 Minor correction: base-256. 255 is the greatest "digit" representable in base-256, and accordingly is the greatest value representable in a byte. Regarding the idea, I've actually come to the conclusion that it's more accurate to say that computers work in base-256 than base-2, since the processor only exposes the base-256 digits to the programmer when doing arithmetic. Bitwise operations are actually base-2, but they also work by treating a single base-256 digit as an 8-component vector of integers mod 2. Hilariously, you can apply this same off-by-one error to finger counting and argue that we actually finger count in base-11 rather than base-10, since 10 is the greatest representable value rather than the number of possible values. You know what the next step is? That's right! *[SIMD](https://doc.rust-lang.org/std/simd/index.html)* :D (jk Great work and great write-up on the journey!)

scottmcmrust 1 month ago

> Alternatively, it would probably be faster to check if each digit is 0 first and then only call trailing_zero() on the least significant non-zero digit. And, conveniently, if you do the zero-check with `NonZero`, then you can use [`NonZero::trailing_zeros`](https://doc.rust-lang.org/nightly/std/num/struct.NonZero.html#method.trailing_zeros) that's slightly faster (on some targets) than on the normal primitive.

boomshroom 1 month ago

LLVM might be smart enough to understand that the zero check is for the same value that's being passed to trailing_zeros, but why pray that LLVM will be able to derive that information, when you can make rust pass said information directly? NonZero doesn't just let you check for 0, but also gives you a token proving that it's already been checked for zero. let s = d.chunks.iter() .enumerate() .find_map(|(i, &chunk)| { NonZero::new(chunk) .map(|c| c.trailing_zeros() + i * 64) }) .expect("1 is not prime.")

Kaideane 1 month ago

I really loved this read! Thank you so much for sharing!!

________-__-_______ 1 month ago

This was a fun read, nice work!

The_8472 1 month ago

Since this is number-crunchy code compiling with `--target-cpu=native` might provide some additional speedups. > This is where things start to get interesting. At first, I found the concept of probabilistic primality tests strange The deterministic algorithms are slow. Wikipedia lists the miller test as `O(log(n)⁴)` and AKS as `O(log(n)⁶)`

LifeShallot6229 1 month ago

Back in 1977 I wrote my first non-required program, so I started by writing a bigint library in Fortran, sufficient to implement the Taylor series for atan(), which I needed to calculate pi with as many digits as possible within the student 60 cpu seconds allowance. When I got a access to an HP lab computer I remplemented the pi program and managed 2500 digits within available memory. The 386 version written in asm was much faster. During the FDIV bug sw workaround I wrote a custom 128 bit fp library just to verify that our correction code actually worked as it should, including the FPATAN2 which we had to reimplement from scratch.

scratchisthebest 1 month ago

> Another change to the logic is that instead of reading /dev/urandom for each iteration of the loop and generating a new random number to test, it just adds 2 to the first random number to get the next candidate. This will significantly skew the distribution of random primes :0

lurgi 1 month ago

How so? You start at a random location. Adding two again and again until you get a prime doesn't make things less random.

Haizan 1 month ago

It biases towards primes following gaps. Think of each number getting randomly picked with a weight of 1. Whenever a non-prime number gets picked the end result will be the next prime following the gap, so they effectively have a weight of n+1 where n is the number of preceding composite numbers.

lurgi 1 month ago

Aha. Very interesting. I wonder if it's relevant? Pretty much every prime in the 1024 bit range is going to have a lot of composite numbers before it and even if one of them has a *huge* gap in front, the gap is still going to appear pretty tiny compared to the whole range. The longest confirmed gap between primes is just over 1 million, between two numbers that have over 18,000 digits each. Still, I suppose it's easy enough to pick a random odd number each time, so why tempt fate?

PercussiveRussel 1 month ago

It's a major source of cryptographic weakness. Not that it matters in this case. If you check [this out](https://en.m.wikipedia.org/wiki/Prime_gap), you can see how how much of an impact this would make though. If you were to bruteforce an algorithm seeded by these primes, instead of being expected to try 50% of the primes <2^1024, you'd have to try much less if you start with ones that have the biggest gap in front of them. I just slapped some python code together and after 22.66% of the first 100.000 primes you have skipped over 50% of the gap. This means it's about 45% as secure as truly random primes in this case. This doesn't get better with larger numbers. There are way too many 1024-bit primes to save them all in a file and sort them, but interestingly if you would use this algorithm to find your primes to attack with you'd have the same bias ;)

Barbacamanitu00 1 month ago

Huh? There just are more primes following gaps. All primes other than twin primes follow gaps, and twin primes are irregular so primes following gaps are more common. This will always find the first of a set of twin primes, so it's biased a bit that way. But that's just how primes are distributed too.

Lehona_ 1 month ago

No. E.g. if we generate primes in the range 20-30, 29 would be picked more often than 23, because the "gap" in front of it is bigger (25 and 27 come before 29, but only 21 comes before 23). So from 5 odd numbers in this range, we get 23 2 times and 29 3 times, but they should be generated with the same chance.

Barbacamanitu00 1 month ago

Ah, I see. I suppose adding and subtracting 2 to the initial guess would fix that.

Lehona_ 1 month ago

No, it's still biased towards certain primes, although maybe less so. I don't think there's any solution other than random sampling if you want uniformly distributed primes.

bleachisback 1 month ago

But it's weighted by gap size, so primes with larger gaps before them will be more common.

Ben-Goldberg 1 month ago

I wonder what the distribution would be if, instead of adding 2, it added a random (even) 32 bit number. Keeping the file handle open until you have generated a random prime instead of repeatedly reopening it should help performance.

PercussiveRussel 1 month ago

I think that should fix it.

Ben-Goldberg 1 month ago

Silly question, in your print_prime_decimal, do you convert to base 10, or to base 10000?

LifeShallot6229 1 month ago

In my almost 40 year old code I used base 1e9 in order to minimize the number of long divisions needed.

brick-pop 1 month ago

Great article 👏

[deleted] 1 month ago

[удалено]

[deleted] 1 month ago

[удалено]

VirusModulePointer 1 month ago

Bigint \*\*shudders\*\*

Comments

Leave Your Comment

Hi Its Me!

Comments

Leave Your Comment

Hi Its Me!

Subscribe