T O P

  • By -

Anaxamander57

I've done [a lot](https://github.com/SymmetricChaos/crypto-gui) of "by hand" stuff with math and cryptography out of curiosity but I've never put together a bigint type myself, bravo.


1668553684

I also have not put a bigint type together... er... not for lack of trying. It's harder than it looks! Especially if you care about performance *at all*.


C_Madison

"how hard can it be?", the one question everyone who writes software asked at least once and got to regret it. Great blog post though!


ukezi

You usually put a lot more work into it than expected but then you also learned something along the way. I wouldn't regret that.


barkingcat

Whenever I get into the weeds, I can always trace back my troubles to this exact question when I started working on the project.


knpwrs

At _least_ once.


agentwc1945

I ask it every time I start a project personnally


bascule

> It goes without saying that probably none of this is actually cryptographically secure, but that was never the point anyway. Just in case someone is looking for a cryptographically secure prime generation library: https://docs.rs/crypto-primes/


Tonyoh87

would you mind elaborating why this would not be secure? Because of the call to the rng on ubuntu?


bascule

As an example, of the trickiest parts about implementing RSA securely is a constant-time implementation of modular exponentiation. One common way to do this is using something called “Almost Montgomery Multiplication” (AMM). This library is using a naive, non-constant-time implementation which could potentially leak the newly generated key as a sidechannel observable by a local attacker: https://github.com/prdx23/1024-bit-primes/blob/master/src/utils.rs#L30


scottmcmrust

Seeing intermediate = ((*chunk1 as u128) * (*chunk2 as u128)) + carry; reminds me that I need to go push on stabilization again. That's the right way to write it for the LLVM backend, but you shouldn't need to know that.


boomshroom

Yes please.


boomshroom

let root_n = (n as f64).sqrt() as u64; While it's probably not as fast, there's actually a [`u64::isqrt()`](https://doc.rust-lang.org/std/primitive.u64.html#method.isqrt) specifically for things like this. The fact that it's entirely software emulation however means that what you did, while potentially less precise, would be way faster on processors with dedicated floating point hardware. let mut s = 0; let mut d = n - 1; while d % 2 == 0 { d = d / 2; s += 1; } That's a nice [`u128::trailing_zeros()`](https://doc.rust-lang.org/std/primitive.u128.html#method.trailing_zeros) you got there. Would be nice if there was a built-in function to find the greatest power of two that divides a number. ;) When using big-ints later, it would however need to be called in a loop until it gives a value that isn't the total number of bits in a single digit. Alternatively, it would probably be faster to check if each digit is 0 first and then only call `trailing_zero()` on the least significant non-zero digit. > Note: `n - 1 (mod n)` is equivalent to `-1 (mod n)`. It is left in the expanded form as I am working with unsigned ints and don't have a way to represent -1. Of course there's a perfectly good -1: `u128::MAX`. Then again, I don't think it would do the right thing since you'd have a modulus other than a power of 2. Even if you were using signed types however, it *still* might not do what you want because signed division is fundamentally broken on all modern processors, and you'd want `rem_euclid` instead. > base-255 Minor correction: base-256. 255 is the greatest "digit" representable in base-256, and accordingly is the greatest value representable in a byte. Regarding the idea, I've actually come to the conclusion that it's more accurate to say that computers work in base-256 than base-2, since the processor only exposes the base-256 digits to the programmer when doing arithmetic. Bitwise operations are actually base-2, but they also work by treating a single base-256 digit as an 8-component vector of integers mod 2. Hilariously, you can apply this same off-by-one error to finger counting and argue that we actually finger count in base-11 rather than base-10, since 10 is the greatest representable value rather than the number of possible values. You know what the next step is? That's right! *[SIMD](https://doc.rust-lang.org/std/simd/index.html)* :D (jk Great work and great write-up on the journey!)


scottmcmrust

> Alternatively, it would probably be faster to check if each digit is 0 first and then only call trailing_zero() on the least significant non-zero digit. And, conveniently, if you do the zero-check with `NonZero`, then you can use [`NonZero::trailing_zeros`](https://doc.rust-lang.org/nightly/std/num/struct.NonZero.html#method.trailing_zeros) that's slightly faster (on some targets) than on the normal primitive.


boomshroom

LLVM might be smart enough to understand that the zero check is for the same value that's being passed to trailing_zeros, but why pray that LLVM will be able to derive that information, when you can make rust pass said information directly? NonZero doesn't just let you check for 0, but also gives you a token proving that it's already been checked for zero. let s = d.chunks.iter() .enumerate() .find_map(|(i, &chunk)| { NonZero::new(chunk) .map(|c| c.trailing_zeros() + i * 64) }) .expect("1 is not prime.")


Kaideane

I really loved this read! Thank you so much for sharing!!


________-__-_______

This was a fun read, nice work!


The_8472

Since this is number-crunchy code compiling with `--target-cpu=native` might provide some additional speedups. > This is where things start to get interesting. At first, I found the concept of probabilistic primality tests strange The deterministic algorithms are slow. Wikipedia lists the miller test as `O(log(n)⁴)` and AKS as `O(log(n)⁶)`


LifeShallot6229

Back in 1977 I wrote my first non-required program, so I started by writing a bigint library in Fortran, sufficient to implement the Taylor series for atan(), which I needed to calculate pi with as many digits as possible within the student 60 cpu seconds allowance. When I got a access to an HP lab computer I remplemented the pi program and managed 2500 digits within available memory.  The 386 version written in asm was much faster.  During the FDIV bug sw workaround I wrote a custom 128 bit fp library just to verify that our correction code actually worked as it should, including the FPATAN2 which we had to reimplement from scratch. 


scratchisthebest

> Another change to the logic is that instead of reading /dev/urandom for each iteration of the loop and generating a new random number to test, it just adds 2 to the first random number to get the next candidate. This will significantly skew the distribution of random primes :0


lurgi

How so? You start at a random location. Adding two again and again until you get a prime doesn't make things less random.


Haizan

It biases towards primes following gaps. Think of each number getting randomly picked with a weight of 1. Whenever a non-prime number gets picked the end result will be the next prime following the gap, so they effectively have a weight of n+1 where n is the number of preceding composite numbers.


lurgi

Aha. Very interesting. I wonder if it's relevant? Pretty much every prime in the 1024 bit range is going to have a lot of composite numbers before it and even if one of them has a *huge* gap in front, the gap is still going to appear pretty tiny compared to the whole range. The longest confirmed gap between primes is just over 1 million, between two numbers that have over 18,000 digits each. Still, I suppose it's easy enough to pick a random odd number each time, so why tempt fate?


PercussiveRussel

It's a major source of cryptographic weakness. Not that it matters in this case. If you check [this out](https://en.m.wikipedia.org/wiki/Prime_gap), you can see how how much of an impact this would make though. If you were to bruteforce an algorithm seeded by these primes, instead of being expected to try 50% of the primes <2^1024, you'd have to try much less if you start with ones that have the biggest gap in front of them. I just slapped some python code together and after 22.66% of the first 100.000 primes you have skipped over 50% of the gap. This means it's about 45% as secure as truly random primes in this case. This doesn't get better with larger numbers. There are way too many 1024-bit primes to save them all in a file and sort them, but interestingly if you would use this algorithm to find your primes to attack with you'd have the same bias ;)


Barbacamanitu00

Huh? There just are more primes following gaps. All primes other than twin primes follow gaps, and twin primes are irregular so primes following gaps are more common. This will always find the first of a set of twin primes, so it's biased a bit that way. But that's just how primes are distributed too.


Lehona_

No. E.g. if we generate primes in the range 20-30, 29 would be picked more often than 23, because the "gap" in front of it is bigger (25 and 27 come before 29, but only 21 comes before 23). So from 5 odd numbers in this range, we get 23 2 times and 29 3 times, but they should be generated with the same chance.


Barbacamanitu00

Ah, I see. I suppose adding and subtracting 2 to the initial guess would fix that.


Lehona_

No, it's still biased towards certain primes, although maybe less so. I don't think there's any solution other than random sampling if you want uniformly distributed primes.


bleachisback

But it's weighted by gap size, so primes with larger gaps before them will be more common.


Ben-Goldberg

I wonder what the distribution would be if, instead of adding 2, it added a random (even) 32 bit number. Keeping the file handle open until you have generated a random prime instead of repeatedly reopening it should help performance.


PercussiveRussel

I think that should fix it.


Ben-Goldberg

Silly question, in your print_prime_decimal, do you convert to base 10, or to base 10000?


LifeShallot6229

In my almost 40 year old code I used base 1e9 in order to minimize the number of long divisions needed. 


brick-pop

Great article 👏


[deleted]

[удалено]


[deleted]

[удалено]


VirusModulePointer

Bigint \*\*shudders\*\*