The Singularity - Official Thread

Yuli Ban · Post by **Yuli Ban** » Sun Nov 14, 2021 6:09 am

"Solving Math Word Problems", Cobbe et al 2021
(boosting GPT-3 on math word problems from ~15% to ~60% by self-distilling a critic & best-of=100 sampling)

We’ve trained a system that solves grade school math problems with nearly twice the accuracy of a fine-tuned GPT-3 model. It solves about 90% as many problems as real kids: a small sample of 9-12 year olds scored 60% on a test from our dataset, while our system scored 55% on those same problems. This is important because today’s AI is still quite weak at commonsense multistep reasoning, which is easy even for grade school kids. We achieved these results by training our model to recognize its mistakes, so that it can try repeatedly until it finds a solution that works.
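The "critic & best-of-100 sampling" idea in the summary above can be sketched roughly as follows. This is only an illustration, not the paper's code: `best_of_n`, `make_toy_sampler`, and `toy_verifier` are hypothetical names, and the toy sampler and verifier stand in for a language model and a trained critic.

```python
from itertools import count
from typing import Callable


def best_of_n(problem: str,
              sample: Callable[[str], str],
              verifier: Callable[[str, str], float],
              n: int = 100) -> str:
    """Draw n candidate answers and return the one the verifier scores highest."""
    candidates = [sample(problem) for _ in range(n)]
    return max(candidates, key=lambda answer: verifier(problem, answer))


def make_toy_sampler() -> Callable[[str], str]:
    """Hypothetical stand-in for a model: cycles through the guesses 40..44."""
    counter = count()

    def sample(problem: str) -> str:
        return str(40 + next(counter) % 5)

    return sample


def toy_verifier(problem: str, answer: str) -> float:
    """Hypothetical stand-in for a trained critic: checks an 'a + b' problem exactly."""
    left, right = problem.split("+")
    return 1.0 if int(answer) == int(left) + int(right) else 0.0


print(best_of_n("17 + 25", make_toy_sampler(), toy_verifier, n=100))
```

In a real system the verifier would be a learned model with imperfect scores, so a larger n helps only as far as the critic can be trusted.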

YouTube · Post by **wjfox** » Sun Nov 14, 2021 11:47 am

Yuli Ban · Post by **Yuli Ban** » Sun Nov 14, 2021 4:13 pm

Speculation on the future of language models with long-term memory
(Warning: link may not work if you're not part of this subreddit forum)

Given that people are now starting to give language models the ability to "ponder", as in this recent work (using a scratchpad / inner-voice)

and are seeing success, perhaps the next major "obstruction" on the path towards AGI systems is the need for long-term memory. Currently, language models are limited to a context window of a few thousand tokens, which is too short to hold "memories" of things from any appreciably long time in the past. There have been proposals for building language models with much longer context windows; but unless this is hundreds of thousands to millions of tokens, it probably won't be enough for AGI.

One solution, perhaps, is to build in a separate "memory module"; however, it would be best if one didn't have to fiddle much with existing language model architectures, so that all that training used to build those GPT-3-scale models can be reused. Furthermore, at least when it comes to modeling working memory, language model context window lengths seem adequate; so it's probably not a good idea to replace them with some more general type of "memory".

I could see machine learning engineers keeping language models mostly as they are, and simply making some very minimal changes, in order to greatly expand their memory -- without needing to extend the context window much, if at all. One path they might try is something like this: split the context window up into an initial segment of, say, 200 vectors, and then let the rest be for the text stream. Those vectors might represent a section of memory currently under consideration. Initially, the vectors might represent a lossy-compressed version of all the tokens that have ever passed through the model, in chronological order (e.g. the first vector represents a compressed version of the first 1,000 tokens the system ever saw; the second one represents a compressed version of tokens 501 through 1,500; the third represents a compressed version of tokens 1,001 through 2,000; and so on). Perhaps better than compression at the token level would be to use some kind of average over embeddings of those tokens -- something that would be easier for the model to learn to use, requiring less additional training (it should be easier for it to pick out that a memory block is relevant using features rather than tokens). When the system sees the vector in each of those first 200 slots, it gets some vague idea of what happened in a given window of time. When it needs greater precision about the memory, it might write

<scratch> Zoom in on the vector 11.</scratch>

That would then cause the "memory manager" to replace the entire set of 200 memory window vectors with a compressed version of tokens 5,001 through 6,000. At this point, the model might have pinpointed a relevant memory to help it solve some problem it was asked about.
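The windowing scheme described above can be sketched in a few lines. This is a rough illustration under stated assumptions: windows are 1,000 tokens wide with a stride of 500 (as in the example ranges), compression is a plain average of token embeddings, and zooming in splits the chosen window evenly across the 200 slots. The post does not specify that last detail, and the names `window_span`, `mean_vector`, and `zoom_in` are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def window_span(slot: int, width: int = 1000, stride: int = 500) -> Tuple[int, int]:
    """1-indexed, inclusive token range summarized by memory slot `slot` (slots start at 1)."""
    start = (slot - 1) * stride + 1
    return start, start + width - 1


def mean_vector(vectors: Sequence[Vector]) -> Vector:
    """Lossy compression by averaging embeddings component-wise."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def zoom_in(embeddings: Sequence[Vector], slot: int, n_slots: int = 200) -> List[Vector]:
    """Re-summarize the tokens under `slot` at finer resolution (assumed: equal-sized chunks)."""
    start, end = window_span(slot)
    tokens = embeddings[start - 1:end]  # convert the 1-indexed range to a 0-indexed slice
    chunk = max(1, len(tokens) // n_slots)
    summaries = [mean_vector(tokens[i:i + chunk]) for i in range(0, len(tokens), chunk)]
    return summaries[:n_slots]


# Toy one-dimensional "embeddings": token i is represented by the vector [i].
history = [[float(i)] for i in range(1, 6001)]
print(window_span(11))            # token range behind vector 11
print(len(zoom_in(history, 11)))  # number of finer-grained memory vectors
```

With these assumptions, vector 11 covers tokens 5,001 through 6,000, matching the example in the post.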

Fine-tuning might be used periodically to update its skills (arithmetic, theorem-proving, physics reasoning, etc.), and also to train it to use the scratchpad / inner-voice to zero-in on past memories, to think through problems in greater depth, and also to plan ahead (and explore possibilities exhaustively via backtracking) -- fine-tuning would act kind of like a procedural memory update at various levels.

Thus, perhaps, one doesn't need to wait for breakthroughs in extending the length of the context window, or for fancy new Transformer models (or even post-Transformer models). Like with adding a scratchpad, maybe just some minor tweaks are all that is needed. Just imagine what these language models would be capable of if all the stars line up and what I have described happens...

Yuli Ban · Post by **Yuli Ban** » Thu Nov 18, 2021 12:44 am

Yuli Ban · Post by **Yuli Ban** » Thu Nov 18, 2021 12:49 am

Ozzie guy · Post by **Ozzie guy** » Thu Nov 18, 2021 1:13 am

Yuli Ban wrote: ↑Thu Nov 18, 2021 12:49 am

I swear I can recall Ray Kurzweil saying something like "Human level understanding of language will be enough to have AGI, learning language requires general intelligence and lets you interpret the language to learn other things".

Yuli Ban · Post by **Yuli Ban** » Fri Nov 19, 2021 4:59 am

Ozzie guy · Post by **Ozzie guy** » Fri Dec 10, 2021 8:31 am

I think AI is now improving faster than typical adults can improve themselves.

This was one of my personal milestones as it means AI is now catching up to our capabilities no matter how much we try to change ourselves.

I am seeing articles about AGI related AI improvements from Deepmind, OpenAI, Microsoft etc on average at least once a month.

Can you look at yourself in the mirror and say you are learning one big skill or making massive improvements in an area every month?
If not, the field of AI is, and AI is catching up to you.

funkervogt · Post by **funkervogt** » Fri Dec 10, 2021 1:35 pm

Set and Meet Goals wrote: ↑Fri Dec 10, 2021 8:31 am I think AI is now improving faster than typical adults can improve themselves.

This was one of my personal milestones as it means AI is now catching up to our capabilities no matter how much we try to change ourselves.

I am seeing articles about AGI related AI improvements from Deepmind, OpenAI, Microsoft etc on average at least once a month.

Can you look at yourself in the mirror and say you are learning one big skill or making massive improvements in an area every month?
If not, the field of AI is, and AI is catching up to you.

I've long looked at things the same way.

The upper bounds on human intelligence aren't increasing. In other words, the very smartest members of the human race, who are mostly found at top universities, don't seem to be smarter than the people who filled those positions 20, 30 or even 50 years ago.

However, the very smartest supercomputers improve literally every day.

At some point, almost certainly in this century, the two lines on the graph will intersect.

Ozzie guy · Post by **Ozzie guy** » Sun Jan 02, 2022 3:34 am

An OpenAI employee says there is a 5% chance of AGI in 2022.

Starspawn0 thinks the prediction is reasonable.

Another commenter says the OpenAI employee only uses increments of 5 in his percentage predictions, so a 5% chance is as low as he can go without saying 0%.

Yuli Ban · Post by **Yuli Ban** » Sun Jan 02, 2022 4:29 am

About where I'd put it myself. 2022 is going to be a GREAT year for AI, I can even already imagine some true breakthroughs that will get a lot of people talking about how much progress there is in the field... but outside of some absolute miracle, it's not going to be the year of AGI.

I think 2024 is a better bet for anything truly AGI-like, and even that would be more like Proto-AGI, which, as in an example I gave before, is like combining Siri, Alexa, Wolfram Alpha, Jukebox, GPT-3, DALL-E, optimization algorithms, expert systems, DeepMind's gameplaying bots, etc. all wrapped up in one neat package, without catastrophic forgetting holding it back.

Nero · Post by **Nero** » Sun Jan 02, 2022 2:27 pm

Based on what was seen last year, we averaged one "major" breakthrough every month or so. At that rate, I doubt we will see AGI, or at least what many would define as AGI, this year.

Ozzie guy · Post by **Ozzie guy** » Mon Jan 03, 2022 3:07 am

Nero wrote: ↑Sun Jan 02, 2022 2:27 pm Based on what was seen last year we averaged one "major" breakthrough every month or so, at that rate I doubt we will see AGI or at least what many would define as AGI this year.

Yeah, some kind of breakthrough every month or so is exactly how I view the rate of progress right now.
I have pointed out before that I think AI now gets better faster than an adult human gets better.
When we are adults I don't think we consistently learn a new skill or get radically better at a skill every month or so.

Nero · Post by **Nero** » Thu Jan 06, 2022 9:27 pm

I would hesitate to say that AI improves quite significantly faster than any human does. It takes literal years for human beings to conceptualize something as basic as object permanence, and AI has the capacity to be rather similar: in almost all the fields of application I have seen, it is amusing but incredibly limited, and obvious in the ways that it fails to achieve basic thought, until it isn't.

It is much like the slow evolution of mammals toward humans, stemming from the Cambrian explosion nearly 530 million years ago, which eventually resulted in a species that could achieve spaceflight in under 50,000 years. AI, too, seemingly starts out meandering its way along, unable to string together even the simplest of conclusions, until it begins to learn; from that moment, however long the first logical conclusion takes to draw, it can learn much faster than we can. It can be trained millions of times, even billions, in a single day.

I fear not the man who has practiced ten thousand cuts, but the AI who has practiced one cut, 10 billion times.

YouTube · Post by **wjfox** » Tue Jan 11, 2022 3:59 pm

Ozzie guy · Post by **Ozzie guy** » Mon Jan 17, 2022 12:40 am

Me assuming we will have Human level AGI by 2030 is a cope.

Whilst it could happen, I and others need to be in it for the long haul.

If Moore's law holds true for AI, it will be 16 times better by 2030. Even 16 times better than current AI is still, honestly, far short of a human.

At the very least I am 30-50% confident that AI improvements happen in the range of Moore's law.

Moore's law rate of progress is an increase of about 8.91% every 90 days, and I can say we get a nice chunk of improvement in AI that MIGHT be akin to that figure every 3 months.
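The two figures above fit together if one assumes a doubling every two years, which is what 16 times in the eight years from 2022 to 2030 implies. A quick check of the arithmetic, with the doubling period as an explicit assumption and `growth_factor` as a hypothetical helper name:

```python
def growth_factor(days: float, doubling_days: float = 730.0) -> float:
    """Multiplicative improvement after `days`, assuming one doubling every `doubling_days`."""
    return 2.0 ** (days / doubling_days)


quarterly = growth_factor(90)           # improvement over one 90-day period
eight_years = growth_factor(8 * 365)    # improvement from 2022 to 2030

print(f"90 days: +{(quarterly - 1) * 100:.1f}%")
print(f"8 years: x{eight_years:.0f}")
```

Under that assumption the 90-day increase comes out at roughly 8.9%, and eight years gives four doublings, i.e. 16 times.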

Ozzie guy · Post by **Ozzie guy** » Sat Jan 29, 2022 12:53 am

Some people think this tweet is evidence AGI is around the corner. Whilst I have become a lot more conservative, I will still post things like this.

Yuli Ban · Post by **Yuli Ban** » Sun Jan 30, 2022 3:31 am

I can imagine he's been spooked by an advanced project under his own watch, one which could arguably be considered "proto-AGI." That said, he's also likely trying to build hype himself. He IS an AI CEO of a group whose stated goal is the creation of AGI, so it's not outrageous to imagine him tweeting this.

Yuli Ban · Post by **Yuli Ban** » Sun Feb 06, 2022 12:44 pm

Deepmind's AlphaCode, OpenAI's theorem-prover, and the road ahead... [A rambling discussion of some of my thoughts on this]
By starspawn0

I've seen a few people write some skeptical takes on what Deepmind has accomplished with AlphaCode (none yet about OpenAI), and I just can't fathom how they could see the accomplishment as not being stunningly impressive. Take a look again at the program that AlphaCode wrote to solve the "string matching problem":

https://deepmind.com/blog/article/Compe ... -AlphaCode

You'd have to be a complete idiot to think that it's just a matter of translation (of text into code analogous to translating French into English), like run-of-the-mill programming assistant-type problems (that Codex solves). And even if the solutions generated are correct only 1% of the time, and AlphaCode has to filter out 99% of bad solutions, that's still stunning. Just think of how many ways you can go wrong in coming up with a solution like that -- any little mistaken variable name, pop or append, syntax error, etc. and the show's over.

Now, this doesn't mean that in the next couple months the system will be at or above the 99th percentile in these competitions like AlphaGo was; but I could see it reaching the 75th percentile or even 90th percentile if a larger language model is used and an inner-monologue procedure added to increase its "reasoning power". This would be a great feat of engineering!

The reason I am hesitant to say it will go above the 90th percentile all that soon is that some of these programming competition problems involve geometric insight, which would require some additional training (a large training dataset). For example, consider a simple problem like, "Write a program that takes a set of points (x,y) as input, and then finds a triangle with vertices drawn from those points that encloses the greatest number of points from the set." To solve a problem like that you need to know that triangles have 3 vertices, and you need to have a way of determining whether a point lies inside or outside the triangle. This latter step is something a human could invent if they didn't know a procedure (a triangle is the intersection of 3 half-planes, determined by the 3 sides; and it's easy to check which side of a line a point is on); but a computer might struggle, unless it has absorbed enough mathematical tricks.
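For concreteness, the half-plane test mentioned above can be written out as below. This is a brute-force sketch, not a competition-grade solution: it tries every triple of points, counts points on the boundary as inside (an assumption the problem statement leaves open), and the names `side`, `in_triangle`, and `best_triangle` are illustrative.

```python
from itertools import combinations
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def side(a: Point, b: Point, p: Point) -> float:
    """Cross product: positive if p is left of the line a->b, negative if right, 0 if on it."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def in_triangle(p: Point, a: Point, b: Point, c: Point) -> bool:
    """p is inside (or on the edge of) the triangle if it is never on opposite sides of two edges."""
    d1, d2, d3 = side(a, b, p), side(b, c, p), side(c, a, p)
    has_negative = d1 < 0 or d2 < 0 or d3 < 0
    has_positive = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_negative and has_positive)


def best_triangle(points: List[Point]) -> Tuple[Optional[Tuple[Point, Point, Point]], int]:
    """Try every non-degenerate triple; return the triangle enclosing the most other points."""
    best, best_count = None, -1
    for a, b, c in combinations(points, 3):
        if side(a, b, c) == 0:  # collinear vertices do not form a triangle
            continue
        enclosed = sum(in_triangle(p, a, b, c) for p in points if p not in (a, b, c))
        if enclosed > best_count:
            best, best_count = (a, b, c), enclosed
    return best, best_count


pts = [(0, 0), (10, 0), (0, 10), (1, 2), (2, 1), (20, 20)]
print(best_triangle(pts))
```

The search is O(n^4) in the number of points, which is fine for small inputs but would need a smarter counting method at contest scale.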

And there may be other domains of knowledge like this that would give a human a huge advantage on certain types of problems.

As to what it will mean when computers can routinely beat the top humans in programming competitions and also win at IMO and Putnam-type competitions, I don't think it will immediately translate into radical scientific breakthroughs directly (but maybe indirectly). To make such breakthroughs you still need to do a lot of experiments, and also build several more generations of tools to carry them out. Imagine the ancient Greeks trying to develop deep theories of particle physics without access to CERN-level accelerators, and without even calculators...

A lot of the great theories in math are admired because they are so general. However, being general means that they probably are not going to say very much about specific instances. Then there is a lot of math that just verifies the things we already believed were true. Mathematicians usually already know when certain big conjectures are true or not; they're just looking for an airtight proof. And, finally, there are "important" results that only hold "out at infinity", when the parameters used are beyond any you will see in the real world -- e.g. maybe a factoring algorithm that only runs quickly when the numbers have a million digits.

Years ago, I had a conversation with a neuroscientist and mathematician named Carina Curto:

http://www.personal.psu.edu/cpc16/

Before she did neuroscience work she worked on String Theory, and her current work involves a lot of applications of topology to understanding memory and neuro-dynamics. Anyways, she said something that stuck with me. She said that what really transfers from math to the real world is the definitions. Mathematicians are good at coming up with good definitions and objects, like "homotopy", "homology", and "cohomology", that then can be woven into algorithms that do data analysis. However, because the real world is so messy, it's usually going to be the case that the deepest results on these concepts won't get that helpful. Maybe an algorithm that gives you some rough, statistical measure of topological invariants will help you; but the deeper algorithms a mathematician hopes would prove the true power of math will remain ever elusive.

Well, that's not quite true. If we're talking about the universe at the scale of particle physics, where there are lots of deep symmetries and where the world seems "crystalline", the deep math can help. Or, if we're talking about domains invented by humans -- such as cryptology -- then, again, deep math may help. But if we're talking about biology, biochemistry, stock market behavior, weather, and so on, then much dumber math will probably do just as well as the fancier variety in solving problems. No need to find a global solution to the Navier-Stokes equations, just build a neural net to simulate the behavior of fluids, for example.

What really would unlock the power of mathematics, then, as far as direct impacts go, would be more computing power -- since those dumber algorithms could be applied to much larger datasets. And if we ever get attacked by a superintelligent AI, it probably will quickly realize that a smarter way to solve the problems leading to our annihilation than "deep theory" will just involve acquiring more and more compute.

That said, what I see as the main application to science and engineering, of writing code and proving theorems at an expert level, is more indirect. It will eventually enable us to automate a lot of programming labor. So, instead of taking years for whole teams of people to write very complex code, we could instead have it written in a matter of weeks or at worst a small number of months using next-generation programming assistants (similar to Codex, only a lot better). And we could also have it verified for correctness, through automated theorem-proving tools.

What this would do is reduce the perceived risk of engaging in any particular line of research. Take, for example, anti-aging. Suppose you had some idea about how to find anti-aging supplements by doing textual analysis on medical texts. Without a programming assistant it may take a long time to code that up. And if you're wrong, then all that time will have been wasted. Thus, you might not even bother, unless you are certain it will work. But if you could just tell a programming assistant your idea, and it could code it up and check for you, then the risk will be almost zero.

Now think about what that would mean if you extend that to every time someone has an idea that they think might work, but are afraid will eat up their time if it doesn't. The number of important advances will probably skyrocket, but only indirectly due to next-gen programming assistants.

Yuli Ban · Post by **Yuli Ban** » Tue Feb 08, 2022 3:38 am

By starspawn0

Solving programming competition problems and proving math theorems can both be seen as precursors towards doing less structured types of reasoning, such as crafting arguments in natural language. In both domains the amount of world knowledge you need to solve the problems is limited; so doing well on them is more a test of pure reason than other benchmarks. And since deep learning systems appear to be doing well in these domains (program competitions and theorem-proving), it won't be long before they are applied to, for example, the problems of crafting arguments and performing research across a wide array of fields.

Perhaps you've seen that IBM has built a "debater" bot that can form arguments when fed a particular topic and position to take. That system, however, doesn't craft new arguments -- it just does something like a glorified copy-and-paste job to ones it can find on the web. The kind of AI built by Deepmind and OpenAI, on the other hand, will be able to do it. If one adds the ability to also search the web for evidence to support its claims (e.g. scientific research, survey results, clinical trial results), it might craft arguments better than whole teams of experts.

[Naive people probably think, "it should be almost impossible to find good arguments to support a position that is wrong." However, it's often the case that our knowledge is fuzzy, contradictory, and probabilistic; and the limits of debate are often not as clear-cut as one might think (e.g. when debating policy, do we consider just the next year or is it important to consider the long-term effects?). And our values are not all the same; and an individual's values are often contradictory ("hypocrisy"). There is often enough wiggle-room to produce arguments for just about any position, and that's certainly the case when it's a debate about "ought" rather than "is" (what we ought to do, rather than what is the truth of the matter). And then there's rhetoric that can be applied, that plays on people's biases and is often even more successful than a good argument.]

Just imagine what the world will be like when you have bots that can craft arguments from any position you want to take. For example, the politicians with the most money might have more good arguments as a result (which may or may not translate into winning elections). Another example: the more money you have, the better-informed you might be about technology, as you can afford AI researchers to help you decide whether to invest in some company. Perhaps AI that can craft arguments and do research will further increase the levels of inequality in society.

I asked:

What would you say would be the single most impressive development that could come after this one?

By Starspawn0
If they can get physical robots to reliably (high accuracy; almost zero slip-ups) solve complex problems given to them in natural language (English) + images or video, and if the AI system controlling them works with any robot after minimal fine-tuning (maybe it's trained on certain robot bodies, but then must be fine-tuned to work with a new body) -- say after less than 10 hours of training -- then that would be the last thing that might impress me. That's really the problem you want solved in order to remake the world. Solving math and programming problems is nice, but that's not going to decrease production costs of most things down to near zero like physical labor automation.

By "complex problem" I mean something like this: you tell the robot, "I want you to assemble the parts like so to build this Apple iPhone" and then you show it a video of the parts you want snapped-together and the tools to do it. Before that, you've fine-tuned the robot for about 10 hours so that it knows how to use the new robot body. The AI software takes your instructions and video, processes it, and then thinks through exactly what you need to do to build iPhones. Then, if you give it a random assortment of parts, it can quickly put them together to build phones, and it would not make any mistakes.

Another example would be: you show the robot some building plans, and tell it to build a house given the raw materials neatly arranged near the foundation. It thinks for a minute, plans out what to do first, then second, and so on, and then starts building. It works 24/7, and before the week is out, the structure is built (it builds as fast as possible, given the limits of the robot body).

If robot AI gets that advanced, then most goods can be manufactured with almost zero human labor; most buildings can be built with almost zero human labor; solar power farms could be built with almost zero human labor; trash could be picked up everywhere; recycling could improve to levels never before imagined; rocket ships could be manufactured with near zero human labor; and so on. Basically, we'd be thrown into a post-scarcity world.
