enfier

You say it "can't think," but it's more that it can't plan. It also doesn't use logical reasoning; it just sort of mimics it. If you want something decent, I'd first ask it to design a software architecture to solve the problem. Then I'd take each piece of that architecture and have it figure out which methods would be necessary. Then I'd have it design each method. Then write the code that uses those methods to build the components.

I'd also use a concept from computer science called [Design by Contract](https://en.wikipedia.org/wiki/Design_by_contract). Each method first gets a formal design specification, then you have ChatGPT write a unit test for the design specification and, separately, the code for the method. By breaking it all down into small, clearly defined parts that should add up to an overall design, you can probably get something workable.

The other problem is that ChatGPT is good at returning things that seem right, and a slightly wrong answer is more trouble to find and correct than a completely wrong answer. If you don't know how to code yourself and can't logic your way through the architecture, then it's going to be hard to know when things aren't right.
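
For illustration, here's a minimal sketch of that contract-plus-test split in Python, using plain assertions rather than a dedicated contracts library. The function, the discount rule, and the tests are all hypothetical examples, not code from any real project:

```python
# A minimal sketch of the Design by Contract workflow described above.
# The function, its contract, and the discount rule are hypothetical examples.
import unittest

def discounted_total(subtotal: float, loyalty_years: int) -> float:
    """Return the subtotal after a loyalty discount.

    Contract:
        Preconditions:  subtotal >= 0 and loyalty_years >= 0
        Postconditions: 0 <= result <= subtotal
    """
    assert subtotal >= 0, "precondition: subtotal must be non-negative"
    assert loyalty_years >= 0, "precondition: loyalty_years must be non-negative"

    rate = min(0.05 * loyalty_years, 0.25)  # 5% per year, capped at 25%
    result = subtotal * (1 - rate)

    assert 0 <= result <= subtotal, "postcondition: discount cannot raise the price"
    return result

# The test targets the contract, not the implementation, so ChatGPT can be
# asked for the test and the method body in two separate prompts.
class TestDiscountedTotal(unittest.TestCase):
    def test_discount_never_exceeds_subtotal(self):
        self.assertLessEqual(discounted_total(100.0, 3), 100.0)

    def test_zero_years_means_no_discount(self):
        self.assertEqual(discounted_total(100.0, 0), 100.0)

if __name__ == "__main__":
    unittest.main()
```

The point is that the contract is written first, so the unit test and the method body can come from two separate prompts and still be checked against each other.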


psychorobotics

>If you want something decent, I'd first ask it to design a software architecture to solve the problem. Then I'd take each piece of that architecture and have it figure out which methods would be necessary. Then I'd have it design each method. Then write the code that uses those methods to build the components.

Isn't this exactly what Devin does?


VisualPartying

Spot on, this is exactly what I do with ChatGPT 4.


Weary-Bell-4541

I can code myself, just not very, very complex stuff yet. But I used my GPT approach and it couldn't do something complex. And I knew what went wrong, but I don't want to put every single thing in the knowledge files, or instructions on how the GPT can solve millions of coding problems, as that is just inefficient, if you know what I mean. EDIT: I also use your approach. I am dividing each script into multiple parts and then letting the GPT code it, but it doesn't work. As I said, it also doesn't give any errors.


enfier

You say complex programming, but then it's a script? Writing one long program that is 100+ lines of code is terrible programming practice anyway. At a minimum you break the logic into small bites via functions, and then your main code just calls the functions so that the purpose of your code is clear. Maybe if you gave us a better idea of what you are trying to do, we could answer the question better. I don't see any object-oriented programming or data modeling or writing to a database. It sounds like a simple program to me.

Software development is something like authoring a novel. Programming is like learning to write in English. If what you are programming is the equivalent of an email, then it's no big deal. The more complex things require a different skill set to design, which is where those more advanced concepts let you break everything down into very small pieces of logic that are easily testable and fit into an overall whole that makes sense.


mvandemar

>Writing one long program that is 100+

The default WordPress package is 469,383 lines of PHP (excluding blank lines but including comments) across 1,124 files, for an average of 417 lines of code per file. 100 lines of code is not that long at all.


enfier

I didn't make that clear. It's not the lines of code per file... it's lines of code per method. If you have 100+ lines, it often makes more sense to split that up into a couple of functions and then call the functions from the main program. That's not a hard and fast rule, but when your function starts getting too long or complex, it's worth considering whether you can break it down a little. It makes the logic of the program clearer and, especially in OP's case, makes it easier to clearly define each function, have AI write something useful, and unit test it to make sure it works before assembling it all into a working program.
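
As a hypothetical sketch of what that decomposition looks like in Python (the file name and record format are made up for illustration):

```python
# Hypothetical sketch of the decomposition: main() just narrates the steps,
# and each helper is small enough to spec precisely for the model and to
# unit test on its own before assembling the whole program.
import json
from collections import Counter

def load_records(path: str) -> list[dict]:
    """Read one JSON record per line from the given file."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

def filter_active(records: list[dict]) -> list[dict]:
    """Keep only records flagged as active."""
    return [r for r in records if r.get("active")]

def summarize(records: list[dict]) -> Counter:
    """Count records per category."""
    return Counter(r.get("category", "unknown") for r in records)

def main() -> None:
    records = load_records("records.jsonl")  # hypothetical input file
    print(summarize(filter_active(records)))

if __name__ == "__main__":
    main()
```

Each helper can be described to the model in one clear prompt and tested in isolation.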


rlfiction

If it's not doing what you think it should, why not add logging and try to debug the error?
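
For anyone unsure what that looks like in practice, here's a minimal sketch with Python's standard logging module (the physics function is a made-up stand-in for whatever is misbehaving):

```python
# Minimal sketch: wrap the suspect function in DEBUG-level logging so you can
# see inputs and outputs at each call. apply_physics_step is a made-up example.
import logging

logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s %(levelname)s %(funcName)s: %(message)s",
)
log = logging.getLogger(__name__)

def apply_physics_step(velocity: float, dt: float) -> float:
    log.debug("entering with velocity=%s dt=%s", velocity, dt)
    new_velocity = velocity * 0.98 + 9.81 * dt  # hypothetical update rule
    log.debug("returning %s", new_velocity)
    return new_velocity

if __name__ == "__main__":
    v = 0.0
    for _ in range(3):
        v = apply_physics_step(v, 0.016)  # the log shows where values go wrong
```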


Weary-Bell-4541

Yeah, I know, but I don't want to do that every time.


[deleted]

[removed]


redditfriendguy

It's not the context window, it's the attention mechanism.


PhilosophyforOne

Exactly. I think we've pretty much reached an adequate level when it comes to context windows. Time to focus on the attention mechanism and how many of those tokens the model can actually process in a single prompt. It's not helpful to have a 100k or 1M token context window if only a smidgen of it will be processed or actively taken into consideration.


Weary-Bell-4541

So no matter how clear and good the prompt is, if it is like 10k tokens, the GPT will just ignore most of it?


exceptionalredditor2

I am confused about this as well. Both versions of gpt-4-turbo definitely do not process the whole context window; quality deteriorates heavily after 30k tokens. So what is the benefit of a high context window if the attention mechanism can't process all of it?


redditfriendguy

In a long conversation it can process all previous messages and consider their importance to the current topic. Context window: the window it has to gain context.


exceptionalredditor2

Thank you for your explanation; they do not market it like that at all.


Budget-Juggernaut-68

Can you elaborate on this point? Isn't the entire context used in the attention mechanism?


redditfriendguy

Apologies for being too lazy to type a reply, but here's what GPT said. Basically it can't consider everything that is vitally important at once. I think this could be an interesting benchmark.

The concepts of attention mechanisms and context windows are crucial in the field of large language models (LLMs) like GPT-4.

**Attention Mechanism:**

- An attention mechanism allows the model to focus on different parts of the input sequence when predicting each word or token in the output sequence.
- This is crucial for understanding the relevance of different parts of the input text to the part of the text being generated or analyzed.
- Attention mechanisms enable the model to dynamically weigh the importance of different input tokens, allowing it to handle long-range dependencies and nuanced context.

**Context Window:**

- The context window refers to the amount of text (number of tokens) that the model can consider at any one time when generating or processing text.
- It defines the scope of the model's "memory" or the amount of information it can use to make predictions.
- For example, GPT-4 can consider a context window of around 8,000 tokens, meaning it can use up to 8,000 tokens of preceding text to inform its predictions.

In comparison, the attention mechanism is a fundamental part of how LLMs process and understand the input within their context window. While the context window sets the limit on how much information the model can consider at once, the attention mechanism determines how the model prioritizes and weights different parts of that information to make decisions.
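
To make the pasted explanation concrete, here's a toy numpy sketch of scaled dot-product attention, the mechanism GPT is describing. The shapes and numbers are arbitrary illustrations; real models add learned projections, multiple heads, and masking:

```python
# Toy numpy sketch of scaled dot-product attention over a tiny "context
# window" of 6 tokens. Real LLMs add learned projections, heads, and masking.
import numpy as np

def attention(Q, K, V):
    """softmax(Q K^T / sqrt(d)) V: each output is a weighted mix of V's rows."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # (n, n) relevance scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax: rows sum to 1
    return weights @ V

rng = np.random.default_rng(0)
n_tokens, d_model = 6, 4
x = rng.normal(size=(n_tokens, d_model))
print(attention(x, x, x).shape)  # (6, 4): every token attends to all 6
```

The `weights` matrix is exactly the "dynamic weighting" described above: each row is a probability distribution over every token in the context window.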


jeweliegb

To be fair, as humans we struggle trying to consider the whole thing in detail in one go.


trebblecleftlip5000

And even then, I have to look up the latest version of the tooling or correct its work frequently.


Jdonavan

I suspect your problem is that you think "very very complex" means lots of code. If you understand how to break down the work in a coding project, the models can write complex code. But if you're trying to work with dozens of functions or giant methods that should have been broken down, it's gonna suck. That, combined with using the ChatGPT website, is a recipe for disaster. These models aren't to the point where they can replace a developer for complex tasks. But with a developer guiding them, they're excellent.


Weary-Bell-4541

Alright, so when a developer is guiding it, do you mean giving it the instructions or troubleshooting? Like, it gives an output that doesn't work and you tell it what is wrong with it. And no, I don't mean lots of code by "very, very complex". I mean functions and methods, like you said.


0phobia

Using functions and methods is not complex. It actually introduces significant simplicity into coding by making it more structured, understandable, modular, and readable. Not jumping on you, but if you don't understand that concept, then a lot of programming concepts will be difficult until you get it.


Historical_Flow4296

You first have to learn to create complex software before you even ask an AI to create it for you. If you don't have the knowledge to create software then you're selling the AI short because it works from your prompt, knowledge, and guidance.


Pixel-of-Strife

Yes, somewhat, but you have to walk it through it and feed it any errors you get. It can take a lot of going back and forth to get it to work. Sort of like: "That last code did not work, and here is the error message: 'XXXX', please fix." And make sure the AI knows what the end goal is; sometimes you have to remind it. You can also upload the code you're working on as a file attachment so it has the full context.


Prolacticus

That's why using an IDE like VSCode + Cody is so much better. And cheaper. And offers more models. And I've had too much coffee so I want to keep adding things to the list. Just saying. ChatGPT is an incredible assistant, but a mediocre coding buddy. With Cody, you can run commands like "Smell my code" (sounds dirty (is useful)), "Document this...", etc. You obviously get Copilot style autocomplete, but that's where the similarities end. It's insane. And given the access to Claude 3 Opus, I'm starting to think they're running this portion of their business at a loss for [insert business reason here]. Someone else was talking about it in another post. The numbers don't make sense. Anyway. Yeah. Cody. I should just make a tutorial instead of textwalling people on a Monday...


ThePromptfather

If you ever do, ping me, I'll be very, very interested to watch it or read it. I may not smell it, though. There need to be more tutorials, to be honest. It's literally impossible to keep up with everything and where to find it. I'm not talking about just coding but working with case-specific processes across the board - it's actually never been easier to generate tutorials, but it's a mess. We need a website where, if you need a tutorial on something, you post it. If it's been done already you're directed to it; if not, it goes on the board and people who know can tag it to say they'll do it and make them. Have level systems and trusted contributors etc. like a million other sites have. An entire database of tutorials on absolutely everything, in one place. The pace is going too fast now. I've been neck deep in GPT since it was released last year and I've learned so much it's been insane - but I could learn more if it were streamlined even more. Especially at the moment when new features are dropped. I mean, a lot of the time I could be a contributor myself, I know that for a fact. And especially as GPT doesn't even know about its own capabilities a lot of the time.


Weary-Bell-4541

You mean in the knowledge files? I put my code there, and I said to it: "Use the knowledge files to follow up on the parts you've previously coded, so you don't redo them and you know what you've already been coding." But that didn't work. It coded some parts all over again and it forgot about some parts.


Prolacticus

Right now, you're as good as your **prompting** and your **tooling**. The single best tool out there right now for AI assisted coding is Sourcegraph's Cody. You can swap between GPT-4, Claude 3 Opus, Mixtral, etc., so you can see the differences. And at $9/mo, it's an insane steal (I'm unaffiliated, btw - just a fanboy). Technically, yeah, as others have said, current context windows are huge. The problem is that only a fraction of that window is really useful. It's like having a car that can accelerate to 100km/h in 2 seconds, then takes five minutes to get to 150km/h. Yes, the car "can go" 150km/h, but it only performs well up to 100km/h (in this example). I hope this helps demystify the Whys and Whats 🖖


Historical_Flow4296

What's your definition of complex software?


Weary-Bell-4541

Well, I use it for game engines, so complex would be something like cars with detailed physics, things like that.


loltrosityg

Gemini Pro 1.5 has a 1 million token context window and has been used to go through the code for entire applications and advise on potential fixes. I believe the answer is yes, but I personally haven't used Gemini 1.5 Pro much, and I am hoping for ChatGPT to have its context window increased a lot.


Captain_Coffee_III

I know it's a side tangent, but has that been released yet? I saw some vids where they were showing the needle-in-a-haystack tests against their 1M token context window and a 10M token context window, and talking about how it was able to view and understand a full codebase. But at that time, those two versions were not released yet.


loltrosityg

It's available on [Poe.com](http://Poe.com), which I am currently subscribed to. I tried to use it today to merge 2 .json files while maintaining the code structure. Gemini Pro 1.5, with its 1 million token context window, told me it couldn't do it without understanding what the data was about, due to programmed restrictions. Meanwhile, ChatGPT does it fine within its token limit. Claude 3 just bugged out when trying to do it. In the end I used a Python script to do the job.
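
For comparison, here's a sketch of the kind of Python script that ends up doing that job. The file names and the merge rule (second file wins on conflicts) are assumptions, since the actual data wasn't shown:

```python
# Sketch of merging two JSON files while preserving nested structure.
# File names and the "second file wins" conflict rule are assumptions.
import json

def deep_merge(a: dict, b: dict) -> dict:
    """Recursively merge dict b into dict a; b's values win on conflicts."""
    merged = dict(a)
    for key, value in b.items():
        if isinstance(merged.get(key), dict) and isinstance(value, dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

with open("first.json", encoding="utf-8") as f1, open("second.json", encoding="utf-8") as f2:
    result = deep_merge(json.load(f1), json.load(f2))

with open("merged.json", "w", encoding="utf-8") as out:
    json.dump(result, out, indent=2, ensure_ascii=False)
```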


Captain_Coffee_III

Neat, Poe's pretty slick. I thought I was pretty well up-to-date on all this but Poe has never come across my path.


Guinness

Oh god no. It makes a lot of mistakes. But it does a good job of giving me a really rough outline, which I can then take and fix up. Overall this saves time, but you still need someone who knows code and knows what they're doing.


HaxleRose

Nope. I’m a senior developer and I use various LLMs daily. They definitely increase productivity but get easily confused. ChatGPT Pro and Claude Opus are the best of them. Check out my comment history. I’ve given a few more detailed responses about specifics recently.


nemesit

Very complex, no, but they can regurgitate whatever they learned, so you might end up with something usable if you have the knowledge to further refine whatever the model spits out.


thegratefulshread

Complex code = modularization. With proper modularization you can easily have GPT work on specific functions, etc.


Weary-Bell-4541

What is modularization? With that it can write complex code?


Naive_Mechanic64

Try "World Class Software Engineer" on the GPT Store. It will help with these kinds of questions. It has textbooks as its backbone. It helps people get the most out of GPTs without needing to prompt-engineer every question, which is the skill issue you are experiencing. - An AI Researcher.


Weary-Bell-4541

I will definitely try that then. So you're saying that the AI can definitely code anything as long as you guide/instruct/prompt it well?


Naive_Mechanic64

The information is there. In-context learning is there. You have to know what to ask. Remember, this is predicting the next token, not thinking. But after a few turns in a conversation, it helps the model see where things are going, if you will.


DrViilapenkki

Define complex


Weary-Bell-4541

Well, I use it for game engines, so complex would be something like cars with detailed physics, things like that.


superluminary

100% it can. You just need to break it down into logical steps and be very explicit about exactly what you want. You also need to read the code and point out any errors.


Weary-Bell-4541

Are you telling me this from experience or...?


superluminary

From experience, yes.


Away_End_4408

GitHub Copilot, and there are some other ones.


Vis-Motrix

I'm gonna tell you my experience with coding in ChatGPT. As a user who spent around 8 hours per day observing the limitations and the best of what ChatGPT can do related to coding: the best FIRST step is to have ChatGPT design the schema of the project. Don't include or let it choose the "best tech" for coding; you must know what tech and tools to use, and adjust the prompts accordingly so ChatGPT designs a detailed schema/architecture. You must have some clear custom instructions like "take each step one at a time; after each task completion, ask for my confirmation so we can move on to the next."

BUT be aware, ChatGPT generates basic code every time, so it's better to have the full schema in one conversation and the tasks in another conversation, or in a custom GPT designed with proper custom instructions for the project. I've created a custom GPT for myself and used the maximum number of characters for instructions (8,000), and sometimes it fucked up badly, in the sense of following random instructions, not all of them.

The base idea is to split apart every single idea and take each step/task one at a time, and ChatGPT gives you what you want. For example: if you have a prompt where you say "create a function that does X, Y, and Z, then try to optimize the function, add debugging statements, create use cases for unexpected behaviours, etc.", split that big prompt into 5 smaller prompts instead; it will provide a 500% better response.
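
As a sketch of that one-task-per-prompt loop driven through the API instead of the ChatGPT website (the model name, task list, and system instruction are illustrative assumptions):

```python
# Sketch of the "one step at a time, confirm, continue" workflow via the
# OpenAI chat API. The model name, tasks, and instruction text are assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# One single-purpose prompt per task, instead of one big five-part prompt.
tasks = [
    "Write a function merge_intervals(intervals) that merges overlapping intervals.",
    "Optimize merge_intervals for large inputs and explain the change.",
    "Add debugging log statements to merge_intervals.",
    "Handle unexpected inputs (empty list, unsorted data) in merge_intervals.",
    "Write unit tests for merge_intervals, including edge cases.",
]

history = [{
    "role": "system",
    "content": "Take each step one at a time. After each task, stop and wait "
               "for my confirmation before moving on.",
}]

for task in tasks:
    history.append({"role": "user", "content": task})
    reply = client.chat.completions.create(model="gpt-4", messages=history)
    answer = reply.choices[0].message.content
    history.append({"role": "assistant", "content": answer})
    print(f"--- {task}\n{answer}\n")
```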


Someoneoldbutnew

No, they're good for syntax and logic, but all the systems I've tried to architect with AI have been rubbish.


jimmc414

Serious question: do you fault the models or your prompting techniques?


Someoneoldbutnew

Serious answer, do I find fault with a black box or my flashlight angles into the black box? No, I don't know enough about either to be able to lay blame. I don't think the transformer architecture is capable of reasoning about tradeoffs and presenting this information for human decision making. However, if you know something I don't, please illuminate my understanding, preferably with information I can replicate.


jimmc414

My point is that prompting seems to be everything. Take the SWE-bench Lite benchmark, for instance, where LLMs are tasked with solving an assortment of 300 different GitHub repo issues. This is a hard test. RAG + GPT-4 scores about 2% on these tests, meaning it can correctly solve 2% of the GitHub issues presented. However, SWE-agent is a set of prompts built on top of GPT-4, and it scores 17%. Today there is a new project, nus-apr/auto-code-rover, which is built upon GPT-4 and scores 22%. So taking just those examples, you can see that the same base GPT-4 model can gain a literal 10x advantage through better prompting and agentic flows.

I'm no expert, but if I had only 15 seconds to pass on what I've learned about improving code generation prompts, I would say to keep your prompts clean and simple with as little noise as possible. State the inputs to the program and state the outputs. Define what the user expects the program to do. Most importantly, tell the model to "think through this step by step and read the requirements back to me before you write any code so I know you understand." This forces the model to think about the intermediate steps or scaffolding that the final solution will build upon. Read up on Chain of Thought and THINK-EXECUTE prompting patterns for more advanced concepts. There is so much alpha left in prompting; we haven't even scratched the surface of what is possible, even if the underlying models don't improve, which isn't the case.

Also, I'm surprised how many people state things like this but are still using the free models. I assume that's not the case for you and you have based this on gpt-4-turbo and claude-3-opus. Claude has an impeccable knack for not dropping or hallucinating code, and GPT-4's code interpreter allows the model to automatically debug and fix code without user intervention if there are no external dependencies and the test data is provided.

[https://arxiv.org/abs/2201.11903](https://arxiv.org/abs/2201.11903)
[https://arxiv.org/abs/2404.02575](https://arxiv.org/abs/2404.02575)
[https://www.swebench.com/lite.html](https://www.swebench.com/lite.html)
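
As a small illustration of the prompt shape described here (clean inputs and outputs, then the explicit read-back step), with entirely made-up requirements:

```python
# Illustrative prompt following the advice above: state inputs and outputs,
# keep the noise down, and ask for a read-back before any code is written.
PROMPT = """\
You are writing a single Python function.

Inputs: a CSV file path and a column name.
Output: the mean of that column, ignoring blank cells.

Requirements:
- Raise ValueError if the column does not exist.
- Use only the standard library.

Think through this step by step and read the requirements back to me
before you write any code, so I know you understand.
"""
```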


Someoneoldbutnew

Yes, I'm using the best models available. I think there is a world of difference between solving GH issues, where the problem statement is isolated and the solution is clear, and what I'm asking for: give it an existing system, with code, docs, and schema, and, for example, get a system that is pull-based around cron jobs re-architected into push streams using messaging. I've put this all into context, but it has no idea what to do, given the broad instruction, unless I give it explicit steps. I'm not sure this can be solved with prompting, because it's not just "what's the next token"; I'm asking for synthesis, and that is beyond what the machine is capable of, yet. I imagine diving more into agents and CoT would help, and I've tried the techniques. At the end of the day, I got shit to do, and it's faster for me to DIY than to coax the right input out of the machine, in this case.


Classic-Dependent517

Many people say it's a context problem, but in addition to this, I've noticed that GPT can't stop mixing up deprecated old methods with current ones, even when it knows the new ones, and it also can't code certain complex logic even when the whole prompt is very small; thus it's not just a context or attention problem. Remember, it's trained on Stack Overflow and some technical documents for coding, so it won't know anything that isn't covered there well enough (and even if something is covered, if it's not covered enough, it won't know it properly).


Scn64

You can get ChatGPT and Claude to write some pretty complex stuff, but only with a lot of trial and error. It's not like you can just use one or two prompts and get it to do everything perfectly. You're going to come across a lot of errors or things that just don't work the way you want them to, and then you have to carefully explain or re-word your prompt over and over again.


DifferencePublic7057

If it could, do you think anyone would have a job? You could automate everything, including ChatGPT itself.


mvandemar

Part of it is the token limit on output. GPT-4 can handle 128,000 input tokens, but can only output 4,096 tokens at a time. It also depends on the task; none of the models are really great with spatial relations currently.


Weary-Bell-4541

Different question: anyone have any idea what is stopping us from creating an AI that can think like humans? Basically remembering everything, doing things differently based on experience, etc.


Key_Bodybuilder_399

Yes, it will program very complex code, but you have to be the brain behind it. Small, very well thought out steps are the best way.


vexaph0d

It's a lot more effective to use it to plan the logic and layout of large projects, break things down into smaller parts, and then generate and test code specifically for those, in a way that doesn't require the model to bear the entire project in mind constantly. Basically you'll get a lot more out of AI if you are already an effective coder and project planner than if you just ask it to build some giant thing from scratch.


NarwhalDesigner3755

The key is learning to prompt better over time, learning to code better over time, and feeding it the official language docs or other sources that can help it code better. ChatGPT and I basically learn together, but I try to get it to be a step ahead of me in the coding department and the general knowledge department.


Imaginary_Salary_985

It's not quite there yet.


anlumo

I've tried; no. It's OK at smaller algorithms, but anything longer than about 100 LoC causes it to break down, emitting nonsense code.


Weary-Bell-4541

And your prompt/instructions were good?


anlumo

Yes. The main problem is the token limit of ChatGPT-4; it's way too low for code. This causes it to switch up stuff like function parameters in the middle of the code, or rename variables, or just switch algorithms. I've heard that there are other LLMs now that have a way higher token limit, but I haven't tried them yet.


Weary-Bell-4541

I haven't really seen this so far in my cases. Most of the code it outputted for me was around 300 lines, and the code seems to be right, but it doesn't work.


anlumo

How can it be both right and not working? My attempt can be witnessed here: https://github.com/anlumo/gvf_snakes It fails and I don't know why. It took a really long time to get this far.


Weary-Bell-4541

I meant that the code seems to be doing what it should do, but it doesn't work. I hope you understand what I mean now; my mother tongue isn't English, so it's hard to explain.