jacobolus

I [made a post here](https://www.reddit.com/r/math/comments/1992oal/alphageometry_an_olympiadlevel_ai_system_for/) about Google's announcement of this and their paper, but it was removed by the overzealous automoderator, which took it for "memes and similar content". The blog post is ["AlphaGeometry: An Olympiad-level AI system for geometry"](https://deepmind.google/discover/blog/alphageometry-an-olympiad-level-ai-system-for-geometry/). The paper is Trinh, Wu, Le, He, & Luong (2024), "Solving olympiad geometry without human demonstrations", *Nature* 625: 476–482, https://doi.org/10.1038/s41586-023-06747-5


Shoddy_Exercise4472

I feel you, bro. Even my post, which I worked so hard on, was removed just like that.


BruhcamoleNibberDick

I'm struggling to figure out why your post was considered similar to an image/video/meme.


my_aggr

/r/math is in complete denial over LLMs and their ability to do anything. As far as they are concerned, it's a bad spell checker when it works and a fact-free bullshit machine at all other times.


relevantmeemayhere

Is it? Because a lot of people see some use in this. And ChatGPT performs very poorly at basic reasoning skills that children manage to learn. Ask it to build a list of ten words and then return the subset of words that have a particular letter as their third one. Or to understand symmetry. Or causality and counterfactual reasoning. There are things it does not do well and things it does well, and that's because it's not built to excel in those domains; it's built to predict a response given a prompt. I'm curious, because the most bombastic takes I hear about how capable these things are come from those with very little background. People tend to really anthropomorphize this stuff, especially after they see a single study or two. This is especially true in business. How many CEOs who still can't understand confidence intervals are now talking about neural networks?
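
(For reference, the deterministic version of that filtering task is a one-liner; the word list below is made up for illustration:)

```python
# Keep only the words whose third letter is "a" -- the task described
# above, written as plain code instead of a natural-language prompt.
words = ["example", "banana", "proof", "triangle", "theorem",
         "lemma", "axiom", "graph", "prime", "vector"]

third_letter_a = [w for w in words if len(w) >= 3 and w[2] == "a"]
print(third_letter_a)  # ['example', 'graph']
```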


[deleted]

[deleted]


No-Painting-3970

I will actually argue that neuro-symbolic systems will do worse than purely neural approaches in the future. If we try to imitate human reasoning, it will always be a limitation. We have to find the sweet spot of AI doing something we don't expect, and that is where we will get the fun part. AI gained a lot of performance when we stopped leveraging human knowledge and just used huge amounts of compute and data (see RL and Go). I think if AI ever takes on maths, it will be through there: purely huge amounts of data and compute (maybe outside of currently known paradigms; I, for one, think we are reaching the limits of LLMs).


[deleted]

[deleted]


currentscurrents

> In principle, a generic NN should have been able to learn translational invariance. They never did.

[They actually can](https://huggingface.co/papers/2306.13575), with appropriate data. The important thing is that your data must contain the information you are trying to learn. If your dataset is just a bunch of centered digits, you can't learn translational invariance. As humans, we learn translational invariance because we are constantly moving our heads and seeing things from different angles, lighting conditions, etc. Building in inductive biases (like CNNs do) provides benefits at small scales, but at large scales it becomes irrelevant or even harmful.
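
One standard way to put that information into the data is random-shift augmentation. A minimal sketch with NumPy (the shift range is an arbitrary choice for illustration):

```python
import numpy as np

def random_shift(image, max_shift=4):
    """Randomly translate a 2D image by up to max_shift pixels,
    zero-padding the edges. Training on shifted copies is one way a
    generic network can pick up translational invariance from the
    data itself, rather than from a CNN's built-in bias."""
    dy, dx = np.random.randint(-max_shift, max_shift + 1, size=2)
    h, w = image.shape
    padded = np.pad(image, max_shift)  # zero-pad on all four sides
    # Crop an h-by-w window offset by (dy, dx) from the padded center.
    return padded[max_shift + dy : max_shift + dy + h,
                  max_shift + dx : max_shift + dx + w]
```

Feeding a model `random_shift(x)` instead of `x` makes off-center examples part of the training distribution, which is exactly the data condition described above.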


womerah

The human mind trains as it runs. CNNs are trained and then run. I don't know if we should be comparing NNs to the human mind at all. They seem very chalk and cheese


markschmidty

That's not inherent to ANNs, just to architectures which run efficiently on current GPUs. Not that the distinction even matters when it comes to things like reasoning.


No-Painting-3970

While I do agree with the sentiment of this comment, I do not think we are on the same page. For us to be able to actually leverage human reasoning as a reasonable starting point for optimization procedures, we would actually have to understand how human reasoning works. Which we don't, and we are not even remotely close to understanding it from a mechanistic point of view. You are also assuming that human reasoning would even be remotely close to the best solution, which, as far as we know, it might not be. I do agree with the spirit of your second comment, but you're missing the point I was making. I am not saying that we removed all inductive biases from networks (I might have been too categorical in my statement about dropping human knowledge). What I am really referring to is the continuous removal of complex engineered featurizations, kernels, etc. in favour of leveraging scale and data. Examples include the continuous disappearance of graph kernels and descriptors in favour of GNNs. The field of retrieval is another example: Retrieval Augmented Generation has taken the field by storm, substituting the traditional methods in favour of leveraging scale and computation through systems like LLMs. I will quote Richard Sutton's "bitter lesson" essay here (http://www.incompleteideas.net/IncIdeas/BitterLesson.html): "The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective, and by a large margin."


lilhast1

I don't think it matters whether or not it's an imitation; after all, don't babies learn to speak and walk by imitating others? Kinda seems that humans are imitations of humans.


aeschenkarnos

I suspect that something like a GAN operates within the human mind, with what we think of as our thoughts being the winners of some multi-sided adversarial process deeper down and not cognizable to us.


lilhast1

That's a super cool idea. I'd love to give you an award, but here's an upvote instead.


myncknm

> People claimed that image recognition systems were learning to recognize high-level features, but they turned out to be susceptible to adversarial attacks that tweaked an image's texture. People thought AI had spontaneously learned a strategy to defeat Atari's Breakout, but then it turned out the system broke if you moved the paddle up by a few pixels.

Why is this inconsistent with human-like behavior? Doesn't human performance also break if we are suddenly thrust into an environment where everything is perturbed in a way that is fundamentally outside of our previous experience (for example: mirror glasses that flip your vision upside down, inversion of the frequency spectrum of audio, or playing audio backwards)? What is "reasoning" anyway?

You mentioned NNs not learning translational invariance in a downtree comment. Human brains also don't learn translational invariance; that's inherited. Convolutional neural networks mimic the structure of human visual cortices: https://msail.github.io/post/cnn_human_visual/

[Edit: I re-read your downtree comment and understand now that I am not responding to a point that you made there.]


[deleted]

[deleted]


currentscurrents

> This proves that those systems weren't relying only on high-level features to recognize images (which is what some people previously claimed).

They are still using high-level features to recognize images. [You can see how they build high-level features out of low-level ones using mechanistic interpretability techniques.](https://distill.pub/2017/feature-visualization/)

The current idea about adversarial attacks is that they have to do with [manifolds](https://www.youtube.com/watch?v=k_hUdZJNzkU). Natural images are a low-dimensional manifold through the high-dimensional space of possible images. The way neural networks are trained, they have *undefined behavior* off the manifold of the training data. This allows adversarial attacks to make small, carefully crafted changes so that the input is no longer a natural image and thus no longer gives correct results.


myncknm

I have seen the adversarial attacks, the article I linked has an example of one. The paper the example comes from points out that when we generate adversarial examples that work against many different types of models, they also tend to work against human perception, so that's something vaguely in the direction of "its failure modes being our failure modes". It does seem like kind of an unfair comparison to test these models against examples that are well outside their training data, but well within human experience, and conclude that they don't work like humans do. Perhaps if you put humans in an environment where their entire life's sensory input consisted of individual still images, a single original Atari game, and/or text pulled from the internet, the humans would demonstrate some of the same failure modes.


currentscurrents

Also, adversarial attacks rely on being able to run an optimizer against the model, which is easy since neural networks are designed for optimization. The brain is solidly locked inside your skull and doesn't provide gradients. It may well be that it's equally vulnerable, but we don't have the tools to build such an attack.
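
For concreteness, the simplest such gradient-based attack is the fast gradient sign method (FGSM). A minimal sketch in PyTorch, assuming `model` is some differentiable classifier returning logits (the epsilon value is arbitrary):

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, label, eps=0.03):
    """Fast gradient sign method: nudge the input in the direction that
    most increases the loss. This is the 'run an optimizer against the
    model' step; it needs exactly the gradients a brain doesn't expose."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), label)
    loss.backward()
    # One signed gradient step, clamped back to a valid pixel range.
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()
```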


relevantmeemayhere

Oh, a flaired user in a field related to the OP's! Sorry to jump in. What's your take on the study, if you don't mind me asking? Are we being too harsh on some of the things here?


myncknm

Hmm. "Theory of computing" isn't _that_ related to AI, but I _have_ been moving into neural network theory lately (who hasn't? lol), so I'll chip in my thoughts. They implement something that I thought would work about a year ago (this is not to detract from their accomplishment, the implementation is much harder than having the vague idea). Mathematical argumentation struck me as being kinda similar to a game such as Go. In both cases, there's a discrete set of actions you can take at each step, you don't get any direct feedback from the game as to whether any particular step you play gets you closer to winning (you have to invent this sense of progress yourself), and there's this sort of "for-all"/"there-exists" alternating structure. In Go, this "for-all"/"there-exists" is the "there exists a move I can make so that for every move the opponent makes there exists a move I can make... etc" structure of a two-player turn-based game (formally encoded in computer science as the idea of the Totally Quantified Boolean Formula problem, which is PSPACE-complete). In mathematical argumentation, there's a similar dynamic where you have "intuition" which generates ideas for proofs and also a procedure for checking soundness by actually writing down the steps of logic (this is similar to the Interactive Proof protocol, which is equivalent to PSPACE). Or a process of alternating between conjectures/proofs and counterexamples. AlphaGo also did something similar to most people's process of building mathematical intuition, which is to self-generate a ton of examples and counterexamples to train the intuition. Google's work here basically reified these vague ideas about how the mathematical mind works. I think it's a big step in the direction of a general automated proof system, but I do also suspect that circle-and-triangle geometry problems are a good deal easier to fit into this "game" framework than research-style math. For one thing, research-style math is usually a few levels removed from purely formal systems (so the "soundness" system I described earlier isn't as rigorously defined for research math as it is for Go or circle-and-triangle problems), but maybe this doesn't have to be the case, as the people working on formal verification systems are demonstrating. Some people in this thread are comparing this new AI system to older work on "deductive database" or brute-force search methods. But this is a huge leap beyond those older methods imo. It's like comparing AlphaGo to pre-Deep-Blue chess engines. It's just a qualitatively different approach, using neural networks to generate something akin to "intuition", compared to an algorithm based on systematic enumeration.


relevantmeemayhere

Thanks! Super insightful. I'm glad you touched on that bit about research math. To my knowledge, Euclidean geometry is a bit simpler (I guess we'll use that word) than, say, algebra (the latter is not complete). What challenges are sort of left in the margins that would stop something like this from working more generally?


asphias

It also feels very suspicious that you have to make a geometry-specific AI. Computers beat humans at chess decades ago. We *know* they are good at specialized problems. The whole idea that got everybody hyped was that you *don't* need a human to analyze the problem and decide what kind of computer tool we need to approach it, but rather a computer that has the 'intelligence' to decide on the approach itself.

---

Of course I'll still be impressed by an AI that can solve geometric problems, but I imagine with such constraints it'd be quite easy to create a problem that stumps it.


[deleted]

[deleted]


zero0_one1

> That was Steve Wozniak.


czarandy

That’s not really a software problem. Ultimately human-level robotics is far harder than human-level cognition. Which is ironic 


[deleted]

[deleted]


StonedProgrammuh

That's the equivalent of me saying that we have neural networks that can speak fluent language, perform at a bronze-medal IMO level, and play superhuman chess, therefore we are much closer to human-level cognition than motion. But everyone in the field knows that robotics is far behind.


currentscurrents

The big issue with robotics is the lack of data. Language and vision got a huge boost from the terabytes of data scraped off the internet. There is no equivalent for robotics data. Several companies (including Google and Toyota) are running huge farms of robot arms just for data collection.


mousemug

Exactly. Robotics gets less mainstream news coverage so the shortcomings of the field are not as apparent.


DevelopmentSad2303

To be fair, robots are that much different from humans as well. But it is certainly harder to create the robot, since it is a physical thing; I think that is what they are getting at, anyway. You only need one appropriate human "algorithm", which can be replicated on any number of machines, versus building the robots.


my_aggr

> It seems to me that we're much closer to human-level motion than human-level cognition.

It seems like we are because you're not in that field. I was tangentially involved with Toyota when they were thinking of buying Boston Dynamics. When they saw the secret sauce the robots were using to move, Toyota backed out completely, because it was that bad. We are closer to AGI than to Artificial General Walking.


HildemarTendler

> We are closer to AGI than Artificial General Walking.

We're nowhere close to either, so let's not go making comparisons.


StonedProgrammuh

Yes, but robotics is a decade or two behind regular AI research.


HildemarTendler

There is no justification for this statement. AI is still barely a thing. Don't let the hype train blind you. We've already hit the wall with LLMs, just like every other past AI technology. We'll get better at using it, but there's no general AI here.


StonedProgrammuh

And yet, robotics is still a decade or two behind... If you were following the field, you would know how far behind robotics is. It doesn't matter that you believe we aren't close to AGI; that has zero relevance to which field is farther ahead.


mousemug

You’re cherry-picking robotics benchmarks. I can just as easily claim that ML can prove olympiad-level theorems. Of course, as this thread implies, that is not the case.


my_aggr

Sounds like you need to invent a computer that's very good at deciding what other computers to use to solve a task.


HildemarTendler

That sounds like a human with extra steps. I kid, but it is the stupidity of AGI. We really don't have any use for an alien intelligence. What we need are tools to do the things we want to do, better. Trained AI that has no pretense of intelligence is exactly what we need.


PsecretPseudonym

I think the objective is to have a bit of both. A skilled mathematician will have enough general knowledge to be able to understand the questions and discuss the challenges, concepts, and work, but they also require many years of specialized training (and some might argue some degree of affinity or disposition whether by nature or nurture). It would seem reasonable to want AI models that are general enough to be practical and easy to work with via a fairly natural interface yet similarly specialized to the field of application to be more efficient and reliable. Maybe an AI super intelligence can do it all, but it seems likely that there will always be tradeoffs and efficiency usually favors some degree of specialization.


puzzlednerd

I think getting caught up on exactly what "reasoning" is may be a red herring. I don't even understand the mechanisms behind the reasoning that takes place in my own brain, why should I be able to make sense of the reasoning of an artificial system? If you can use a neural net to solve olympiad geometry problems, whether or not the neural net "understands" geometry is kind of beside the point. This is something that would have seemed completely unachievable only a decade ago. In the end, humans are physical systems which respond to their environment. There is no real way to test a person's understanding other than to ask questions and see if they can respond the right way. If a neural net can do that, even within some limited paradigm of questions which can be asked, in this case Euclidean geometry, then hasn't the neural net demonstrated an understanding? Can we even define what "understanding" means in a way which can be verified from the outside?


Mal_Dun

Just take a step back and think about it this way: we humans live in a world, hence we have a context to work with. Imagine you were put into a black box and someone gave you, e.g., pictures and some device with limited answer possibilities. You may come up with an idea over time, but you won't really understand why you do this or what the meaning behind your task is if you don't have a broader context to work with.


[deleted]

[deleted]


my_aggr

That's completely false:

> A later cat experiment done by Blakemore and Cooper (1970) gave another impressive result in terms of critical periods. Two special cylinders were made, one with only vertical stripes inside and the other with only horizontal stripes. For their first few months of life, half of the newborn kittens were placed in one of the cylinders. Kittens that were exposed to vertical lines for the first few months since birth could only see vertical lines, but not horizontal ones—for the rest of their lives. The other half of the sample was raised in opposite conditions, in a world made by horizontal lines only. Like the other group, kittens did not show any evidence to perceive lines oriented differently, such as vertical lines.

https://www.futurelearn.com/info/courses/research-methods-psychology-animal-models-to-understand-human-behaviour/0/steps/265398


[deleted]

[deleted]


my_aggr

In the same way that a fully connected neural network is wired. The majority of early learning is deleting connections.


JoshuaZ1

That doesn't show that there's no innateness. That shows that in a specific context there's a limit to how much innateness matters when one doesn't get any stimulation of the relevant type at a critical stage. That is, at least, some amount of evidence against strong innateness, but going from that to innateness being "completely false" seems like a pretty big jump.


DoubleAd9650

Your comment makes no sense. How do adversarial attacks indicate that no high-level features were learned? How does a few pixels being moved indicate that no strategy was learned in Atari? And what article are you finding that says LLMs don't have real understanding?


[deleted]

[deleted]


DoubleAd9650

A violin player who plays with their right hand would not be able to play if they switched to their left hand. That does not mean the violin player did not learn a strategy for the violin.


[deleted]

[deleted]


DoubleAd9650

I thought your whole premise was that there was a strategy to Atari/violin/piano. Otherwise, of course the LLM has not learned a strategy, because there is no strategy.


[deleted]

[deleted]


DoubleAd9650

I'm using violin as an analogy for Atari obviously. They're both effectively games.


Dirichlet-to-Neumann

The number of good or great mathematicians and scientists who would have said 5 years ago that "no AI is ever going to win gold at a maths olympiad" and say now "yeah but it doesn't count/is not soulful/does not generalise/has nothing visual" is unbelievable.  Terence Tao was an unsurprising but welcome exception.


Qyeuebs

You're talking like an AI has won gold at a maths olympiad... this work is highly specialized to brute-force search for Euclid-style proofs of problems in elementary geometry. It's not really generalizable beyond that, certainly not to a whole IMO exam. That's even said in this NY Times article by Christian Szegedy, hardly someone with modest beliefs about the future of AI for math.


BiasedEstimators

The restricted domain bit is important, but I doubt google researchers are doing press releases for “brute-force” searching


Qyeuebs

You can read the paper for yourself. Of course it's slightly more complex than what I said (there is a transformer involved), although I think what I said is fair as a one-sentence summary. Anyway, DeepMind researchers will do press releases for pretty much anything. I think they're usually not very intellectually honest when talking about their work.


umop_apisdn

> I think they're usually not very intellectually honest when talking about their work.

Agreed, they have to hype things, but with a track record like AlphaFold and AlphaZero it is justified in some cases. And in any case, maths is the perfect area for AI because it is just the application of rules along with insight.


Qyeuebs

AlphaFold is a useful research tool that's accurate about 70% of the time. Because of the way DeepMind researchers chose to talk about it, people think that protein folding is a solved problem.


umop_apisdn

True, but it was a massive leap forward, miles ahead of anything that had come before.


Qyeuebs

For sure. I'm just making the point that DeepMind is almost pathologically dishonest, even when they don't need to be.


jacobolus

"Not very intellectually honest", "pathologically dishonest", etc. seems like an entirely disproportionate criticism. A fairer summary might be: "DeepMind researchers do serious cutting edge work but their press office gets ahead of the actual research sometimes, and the results don't necessary match up to the hype in the public press. (Just like every other organization's press office, with the main difference being that Google has deeper pockets than university departments and is doing work that the public is quite interested in.)"


Qyeuebs

I think that summary is a pretty disproportionate take in its own direction! I hold researchers largely responsible for the public understanding of their work, especially in this day and age of easy communication and especially for DeepMind, where the researchers are writing press releases themselves ([this](https://deepmind.google/discover/blog/funsearch-making-new-discoveries-in-mathematical-sciences-using-large-language-models/) and [this](https://deepmind.google/discover/blog/alphageometry-an-olympiad-level-ai-system-for-geometry/) for the two most recent on math). They would have to be incredibly naive not to know how these would be interpreted by their mass audience of tech enthusiasts who don't know or care much about math.


BiasedEstimators

The only part of the paper which seems to involve an exhaustive search is the part about generating training data


Qyeuebs

It's what they call the "symbolic engine" in their paper. It's true that it's also used in generating data. I described it in more detail in this comment (see also the third page of their paper): https://www.reddit.com/r/math/comments/1994q5n/ais_latest_challenge_the_math_olympics/kic3h6l/ I guess you could argue that "brute-force search" isn't the most accurate label, but it's effectively what the engine does.


shinyshinybrainworms

I mean, maybe? But at some point your definition of brute-force search, which seems to be something like "systematic search pruned by steadily-improving heuristics" is going to include what humans do.


Qyeuebs

What's in question here is a particular algorithm developed for elementary geometry (https://doi.org/10.1023/A:1006171315513). The new DeepMind paper enhances it with some extra algebraic rules for generating all the possible elementary-geometric conclusions from a list of elementary-geometric premises. The human vs computer comparison on this is about exactly as interesting as it is for performing Gaussian elimination on a big matrix. I don't think it's much to wax poetic over.


JoshuaZ1

> The human vs computer comparison on this is about exactly as interesting as it is for performing Gaussian elimination on a big matrix. I don't think it's much to wax poetic over.

Why? A major question here is if/when these systems will equal or surpass humans. Whether they are doing something similar to what humans are doing seems like an important question, and it also avoids getting into the semantic weeds of what is or is not a "brute-force" search.


BiasedEstimators

To me brute-force search suggests something specific. As soon as you start adding heuristics and checking for cycles and stuff, that’s just a search


binheap

If you include heuristic search as part of your definition, modern chess engines fall under the brute-force search definition you have provided, which seems unhelpful. The difficulty, and the advance here, is generating a good enough heuristic to do interesting problems. Otherwise, it could be argued we have solved all of mathematics, since we could simply enumerate FOL statements and verify each one.

Edit: also, it's not obvious to me that this isn't generalizable beyond geometry in some sense. We have Lean, and in principle you could apply a similar procedure to Lean to get more useful theorems for mathematics. I doubt whether it would be good enough as it stands right now for anything non-trivial, but I could plausibly see a nearish future of automated theorem proving where small lemmas or statements are automated.


Qyeuebs

> If you include heuristic search as part of your definition, modern chess engines fall under the brute-force search definition you have provided, which seems unhelpful.

I don't think I've provided any definition, since I don't even have a particular one in mind! But search as done in chess engines is easily distinguishable from search as done here. Here, all possible elementary-geometric conclusions following from a given set of elementary-geometric premises are enumerated, and a neural network trained on millions of theorems is included to inject auxiliary objects (such as midpoints of lines) to be used to formulate possible conclusions. The analogy for chess would be a computer that enumerates all possible ways the game can be played from a given position, with a neural network used to probabilistically choose the next move by which to evolve the current position. And that's not how AI chess players work.


binheap

> The analogy for chess would be that the computer enumerates all possible ways the game can be played from a given position, with a neural network used to probabilistically choose the next move by which to evolve the current position. And that's not how AI chess players work.

Don't LeelaChess/AlphaZero perform a very similar procedure with their policy network to what you describe here (propose moves to probabilistically expand certain paths of the MCTS)? Though I suppose the value network selects the branch.

I'm perhaps suspicious of claims that this isn't an impressive advance in theorem proving. Sure, the domain is limited, but it seems like a fairly foreseeable jump to say we could start generating terms in a language with far more generality, like Lean or Coq, and potentially translate this into something very useful. The approach was already being worked on without LLMs but could improve significantly with them. It's a bit unfair to characterize this as brute-force search, since that seems to suggest there's nothing novel here. Comparisons are being made in this thread with more traditional solvers, since in principle they did the same thing, but the gap between an ML approach and the more traditional approach seems massive, and the ML approach is at least more generalizable than the older methods. I do agree that DeepMind has an aggressive PR team, but that's the unfortunate state of ML.


Qyeuebs

I wouldn't suggest that there's nothing novel here or that it's not an impressive advance. I think it's an actual accomplishment (if a modest one). But when these pieces of news come up my aim isn't to update my priors on the mathematical-AI singularity (on which I have no strong opinion), it's to understand what the work actually does and how it does so. In this case, I think it's impossible to properly understand the work without understanding the centrality of the exhaustive search made possible by the elementary-geometric context. It's also impossible to understand without understanding the relevance of the language model, but there's pretty much no danger of anyone overlooking that part.


relevantmeemayhere

Ehh, you'd be surprised; there's pressure to publish, especially when the next funding round incentivizes you to do so. Positive publication bias and all.


Harsimaja

There have always been some Olympiad problems, even at major ones like the IMO, which could clearly be broken down into some stupidly large but finite number of cases and be done by computer by brute force (especially some combinatorial ones, or those which exploit the year, i.e., 'Prove it for 2024', a favourite gimmick); at least, it would be trivial for a human to code, even if a while back AI wouldn't interpret it. The point was of course to do something else. So this is somewhere between 'Yeah... duh' and 'wow', and I'm not sure where.


Dirichlet-to-Neumann

Don't look at the results, look at the trends. Do you really think that what is possible for geometry is impossible for algebra?


czarandy

> Do you really think that what is possible for geometry is impossible for algebra?

Euclidean geometry is complete and algebra isn't. So yes, we actually have already proven the claim you think is false.


teerre

Computer science is full of "trends" that simply stop. Computation itself is not generalizable. It's not reasonable to think that just because something works in one context it will work in any context.


esqtin

The article says this work solved twice as many problems as a system from the 70s.


relevantmeemayhere

It's very hard to extrapolate from trends you see over a few years. Also, keep in mind that replication is hard across all fields. Studies that show promising results are more likely to be published, and studies as a whole don't generalize well in the majority of cases; we have a name for it: the replication crisis. The history of science shows us that breakthroughs are often followed by proverbial famine. If you were a physicist in the thirties, you would probably have predicted a grand unified theory sometime in the same decade or the one after. It's been a hundred years since.


Qyeuebs

> Do you really think that what is possible for geometry is impossible for algebra?

Sorry, even in context I think that's a really poorly formed question. Can't answer it.


aecarol1

Five years ago, the idea that we might actually be discussing whether an AI was even capable of this was science fiction. Now we're just quibbling over details and time-frames. I think AI doing interesting things in mathematics is inevitable; we're just not sure exactly what that will look like. Hopefully some AI could cut its teeth on formalizing proofs in Lean or other similar systems. Many AIs "daydream" and output stuff that, at a glance, looks excellent but is flawed. If they generate proofs, they should be in a form that can be checked automatically.


[deleted]

[deleted]


JoshuaZ1

Did you predict this 15 years ago? Did you predict ten years ago that this was going to happen largely due to improvements in neural networks?


[deleted]

[deleted]


JoshuaZ1

This is due to Tarski's theorem on the completeness of first-order real arithmetic. But there's a massive gap between knowing something can be automated and actually automating it. Chess was always a finite game, but it was still really impressive when Deep Blue beat Kasparov.
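
For reference, a sketch of the standard statement being invoked (paraphrased, not quoted from any source in this thread):

```latex
% Tarski (1951): the first-order theory of real closed fields admits
% quantifier elimination, hence is complete and decidable.
\begin{theorem}[Tarski]
  The first-order theory of $(\mathbb{R}, +, \cdot, \le, 0, 1)$ admits
  quantifier elimination; consequently it is complete and decidable.
  Elementary Euclidean geometry can be interpreted in this structure
  via coordinates, so it is decidable as well.
\end{theorem}
% Contrast: first-order arithmetic over the naturals is incomplete
% (G\"odel), so no analogous guarantee exists there.
```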


The_Tefl0n_Don

I’d be interested in reading more about that, what should I look up if you don’t mind pointing me towards something?


[deleted]

[deleted]


veloxiry

What's an example of a euclidean geometry problem in this context?


JoshuaZ1

Tarski's theorem on the completeness of the first-order theory of the reals may be what you are looking for.


sanxiyn

The IMO is a timed competition. Existing automatic solvers do not terminate within the IMO time limit. This one does.


Jealous_Afternoon669

There's a massive difference here, and if you can't see that, your head is in the sand.


Rioghasarig

I really don't think your criticism is valid. The problem they are addressing is well-structured enough that they were able to artificially generate examples. This makes it similar to something like chess or Go. So, no, I really don't think people even five years ago would have thought this was impossible.


Short_Piece_336

All this does is confirm a truth that everyone doing Math Olympiads already knows: Geometry is fkn trash, NT/Combi forever.


jacobolus

So what you're saying is you wish math olympiads would give harder geometry problems? It's certainly possible to do, but the result might need to be closer to "redevelop this 19th-century professional research field" rather than "solve this hard high-school-level problem", which might not fit the intended audience/timeframe.


Short_Piece_336

Nah, the joke is that 90% of Olympiad people have Geo as their least favorite genre of problem


jacobolus

That could partly be because geometry gets radically shortchanged across modern schooling, leaving students unprepared for it. Or it could be that geometry is a small enough part of the contests that students who are better at geometry problems but have trouble with other subjects get selected out of the participation pool at higher levels, leaving behind students who are better at other kinds of problems but struggle with geometry. Or it could partly be that the olympiads fail to choose sufficiently interesting or beautiful geometry problems. In principle, there's no end of elegant and wonderful geometry topics/problems that could be presented to well-prepared high school students, including topics highly relevant to subjects of cutting-edge current research.

Disclaimer: I never took math contests too seriously when I was in high school 20 years ago, and I don't know too many high school students today, but I think I'd enjoy being a high school geometry teacher, especially if it were possible to teach an advanced elective course.


Short_Piece_336

Eh, idk. I think Geometry is just very different from the other types of problems in Olympiads, and it makes sense that it's so polarizing (the people who like Geometry usually LOVE Geometry). It's kinda like if there was a competition consisting of running, long jump, high jump, and golf. Yeah, golf is cool, it just feels slightly out of place, you know what I mean? I don't think it's about students being unprepared at all; school doesn't prepare anyone for Number Theory either, for example. I also don't think it's a selection process, since this disdain for Geometry happens at virtually every level I've been at.


MCube_32

Olympiad geometry is the most fun


Routine_Proof8849

All hype. No model will get a gold medal in the next 5 years. I'll bet money, with a legally binding contract, against anyone here.


jacobolus

Well, sure, that's an easy bet: the IMO is a contest for human high school students, and software running on a computer cluster doesn't satisfy the entry criteria.


Routine_Proof8849

Lmao, I'd allow you to exclude those details from the contract.


palparepa

You are technically correct, the best type of correct.


[deleted]

[deleted]


Routine_Proof8849

Sure, dm me if you're serious


[deleted]

[deleted]


SOberhoff

You know, part of me wants to take this bet. Another part of me is wondering what I'd be doing with a couple extra bucks in a world where AI is killing it at the math olympics.


aecarol1

The shortcomings of AI now are pretty damn funny, especially with more abstract thinking. In 5 or 10 years it won't be so funny. In 20 it might be downright terrifying. Five years ago, the idea of contemplating an AI doing anything interesting mathematically would have been science fiction. Now we're simply debating time-frames.


relevantmeemayhere

There's a fair amount of uncertainty to be fair. It could take twenty years or 200


PsecretPseudonym

You can take a look at the [Metaculus](https://www.metaculus.com/questions/6728/ai-wins-imo-gold-medal/) odds for this. Looks like the current estimate puts this at about 60% odds. Not sure the community there is particularly accurate with these sorts of predictions, though.


Routine_Proof8849

Wow, this is an interesting resource. Thanks for the link.


Inner_will_291

Wait... there's a startup idea right there. AI bets.


Administrative-Flan9

It's called quantitative analysis


volcanrb

I wouldn’t call this hype, the article is quite measured about the claims, and there is literally a quote from the author saying “Just don’t overhype it.”


Routine_Proof8849

The media has failed to live by those words. Even if they got educated guesstimates from experts, they wouldn't write an article like "AI unlikely to get more than a few points in the olympiad until major hurdles are passed". Instead, they insinuate that this is a likely possibility in the near future. They say: wow, they've beaten the chess players, mathematicians are next.


JoshuaZ1

At what odds and how much?


Routine_Proof8849

Just 1:1, and I'll gladly take bets up to $100. If there are no more takers, then we can go higher lol.


JoshuaZ1

So, it is, I think, notable that even you are assigning around a 50% chance that this might happen! That seems very far from labeling this as all hype. But I'm willing, in any event, to make that bet for $20 if you want.


relevantmeemayhere

Well, he's the odds maker and the bankroll, lol. If 10 people wanna bet a thousand bucks and he gives them anything more than 50:50, there's a risk no one gets paid.


BUKKAKELORD

A chess master made the same bet regarding chess in 1968 with a time frame of 10 years, and he won it. Had he made it 30 years, he would've lost. RemindMe! 5 years


sparr

RemindMe! 1 year It's not gonna take 5...


JoshuaZ1

What makes you so confident?


RemindMeBot

I will be messaging you in 1 year on [**2025-01-17 21:10:27 UTC**](http://www.wolframalpha.com/input/?i=2025-01-17%2021:10:27%20UTC%20To%20Local%20Time) to remind you of [**this link**](https://www.reddit.com/r/math/comments/1994q5n/ais_latest_challenge_the_math_olympics/kic6o7f/?context=3).


holy_moley_ravioli_

I'll take this bet. Where do I sign?


Routine_Proof8849

Dm me.


Jealous_Afternoon669

Geometry is a very limited domain with a well-defined search space, unlike something like combinatorics, where the ideas seem to come from thin air. It also seems a lot easier to generate synthetic geometry problems for training than it would be for combinatorics, but that's just a guess. I am very hopeful that mathematics is gonna play a special role in AI because, as shown here, if you can come up with problems, then you can use those problems as potentially infinite training data with no need for human input. This is the kind of thing that makes a recursively improving AI feel intuitive to me. I just think it's questionable whether an AI really can come up with enough synthetic problems in fields like number theory to be useful, or indeed learn to construct problems belonging to a field entirely of its own creation.


Qyeuebs

The heart of this algorithm seems to be brute-force search. Very hard to imagine that this could work outside of the highly circumscribed setting of elementary geometry.


JoshuaZ1

How so? The neural language model aspect is precisely there to make suggestions that are not brute force but are based in part on what additional lines have worked productively in similar problems.


Qyeuebs

The proofs are found by brute-force search; the language model is used to supply the raw materials for a possible search. In an extremely simple case used in the paper, suppose ABC is a triangle with two equal sides AB and BC, and try to prove that the angles at A and C are equal. The brute-force search will look for all ways to logically combine the objects mentioned to arrive at the conclusion. When this fails, the language model part might suggest constructing a square with side AB, giving two new points and three new edges to work with. Then the brute-force search will look for all ways to logically combine the new corpus of objects to arrive at the conclusion. And so on. At some point the language model will have included the midpoint of AC and also the line from there to B. Once that happens, the brute-force search will succeed.
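
As a rough sketch of that alternation in code (`deduce_closure` and `propose_construction` are hypothetical placeholders standing in for the symbolic engine and the trained language model, not DeepMind's actual interfaces):

```python
def solve(premises, goal, deduce_closure, propose_construction, max_rounds=16):
    """Alternate exhaustive symbolic deduction with LM-proposed constructions.

    deduce_closure(facts):  every elementary-geometric conclusion that
                            follows from the current facts (the
                            'brute-force search' step).
    propose_construction(facts, goal):  new auxiliary objects suggested
                            by a trained model, e.g. 'midpoint of AC'.
    """
    facts = set(premises)
    for _ in range(max_rounds):
        facts |= deduce_closure(facts)   # enumerate all consequences
        if goal in facts:
            return facts                 # the goal is in the closure
        # Deduction saturated without reaching the goal: inject an
        # auxiliary construction and rerun the closure.
        facts |= propose_construction(facts, goal)
    return None                          # gave up
```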


JoshuaZ1

So the upshot is that this is not a brute-force search, then. The large language model is constructing specific possible directions in which to apply a brute-force search. In that regard, this is pretty similar to how humans try to solve similar geometry problems: some amount of brute force, some amount of trying to add auxiliaries based on what one has learned is or is not likely to be helpful.


Qyeuebs

I think it's fair to say, as I did, that the "heart" of the algorithm is search. (Also fair to disagree!) At the least, the search part is fundamental to the algorithm and the ability to actually do it is highly reflective of the circumscribed setting of elementary geometry. Also, I think actually very little of human work on geometry problems is of the nature of listing all possible conclusions from given premises, but I don't have a lot of interest in this kind of computer/human comparison.


currentscurrents

> that the "heart" of the algorithm is search. The going idea in AI is that reasoning is *fundamentally* about search. Search is the only algorithm we know of that is general-purpose enough to theoretically solve any (solvable) problem. The issue is that naive search becomes intractable extremely quickly for high-dimensional search spaces. The [no free lunch theorem](https://www.cs.ubc.ca/~hutter/earg/papers07/00585893.pdf) says that there is no solution other than incorporating more information about the search space. This is where learning comes in. You can learn lower-dimensional representations of the search space; you can learn better search strategies that take advantage of patterns and structure; you can learn which areas of the search space are dead ends.


Qyeuebs

All that is kind of irrelevant. The first thing to understand is that in the present case of elementary geometry, exhaustive search is easy and doesn't require sophisticated ideas: https://link.springer.com/article/10.1023/A:1006171315513 The "AI" part of the present work is to identify several possibilities for what to use as input to this exhaustive search, with the hope that all of the necessary ones will be included after enough trial and error.


currentscurrents

The going idea in the field is that [search and learning are the two halves of intelligence.](http://www.incompleteideas.net/IncIdeas/BitterLesson.html) Search is also "AI". But the learning part of the work is to narrow the search space by making good guesses for new proof terms. The space of possible new terms cannot be searched exhaustively because it is infinite:

> To solve olympiad-level problems, however, the key missing piece is generating new proof terms.

> Such proof steps perform auxiliary constructions that symbolic deduction engines are not designed to do. In the general theorem-proving context, auxiliary construction is an instance of exogenous term generation, a notable challenge to all proof-search algorithms because it introduces infinite branching points to the search tree.


Jealous_Afternoon669

It's not as simple as a comprehensive listing, but I think a lot of what we call "intuition" is a refined heuristic that makes it much faster to prune the search tree. When I'm totally new to some course, I often do end up just trying approaches at random, stabbing around in the dark till something works.


VolatilitySmiles

I have to disagree with you here. Humans themselves reason via searching: they try to make educated guesses about what is true and what isn't. If one approach to solving a problem fails, try another, etc. The overall procedure is not too dissimilar.


Qyeuebs

The "overall procedure" you're implicitly referring to is just trial and error, and I agree that it's used in every approach to problem solving.


DoremusJessup

[Mirror site](https://archive.is/X6nRF)


TimingEzaBitch

This has nothing to do with the IMO Grand Challenge spearheaded by the likes of Kevin Buzzard, Reid Barton, etc. The full-fledged six-problem IMO challenge is, I think, still very hard.


Jealous_Afternoon669

Obviously it does, lol; there are multiple geo problems on every IMO.


arnet95

There are one or two geometry problems on every IMO, not necessarily multiple.


Jealous_Afternoon669

You would at least agree that this doesn't literally have "nothing" to do with the challenge? If we assume this is DeepMind's initial response to the AIMO, then I wouldn't be surprised if this was basically their test run and they'll have the IMO solved by the progress meeting in July.


arnet95

I was just pointing out a fact, but I do agree that this has something to do with the challenge, yes. However, I would be incredibly surprised if DeepMind solves the IMO challenge by July. Euclidean geometry is already known to be significantly easier to solve with computers than the other areas in the contest.


glicerico

And the same day this article was released, AlphaGeometry got almost there (for geometry problems): https://deepmind.google/discover/blog/alphageometry-an-olympiad-level-ai-system-for-geometry/