So, I am launching a new series where each episode showcases a research paper by reproducing its methodology and highlighting the blatant flaws that show up when you actually do this. Here is Episode 1!
https://www.youtube.com/watch?v=JLa0cFWm1A4
What you're not realizing is that these papers are university students' work!
They are not doctors of science (computer science, data science, etc.). Often they are rehashing ideas that are already floating around internet forums, since that is the arena to search for easy university essays and tasks (often the answer is already apparent).
These papers are also often influenced by lazy professors, university lecturers and teachers, who do not wish to do these lines of research themselves because they are time-consuming. But the topics do need to be explored, or in fact re-explored with the new landscape in mind, as a good paper will only cite information which is less than 10 years old!
But we already had an AI winter because of this: lazy professors writing books about theories which could not be implemented, as this is the way it has been done since these 20th-century universities came about! In fact they just try to bolster their existing standing. Hence the award went to Geoffrey Hinton, the neural network guy, instead of the people who came up with "Attention Is All You Need" and the transformer network, which has in fact revolutionised the industry much more than LSTMs, CNNs (convolutional networks), etc.!
Today we can do everything with this kind of transformer network. Indeed, I have tried many kinds of tasks, from basic to complex, with my 7B model, and have been able to successfully train for these tasks on top of the previous tasks and still keep the original tasks intact!
So I can fly a drone and run a mini robot, as well as decode and encode images and sounds, and also diffuse these items!
I would like to do a few 3D CAD models next, as well as plans for items such as tables and chairs, or even more complex Thingiverse objects!
Problem solving has been shown very well by LangChain, a great inspiration! Graphs, chains, workflows... it goes on! The agentic type (old systems) also shows that the old methods do still work!
We have even seen swarms rear their ugly heads!
We saw a bit of genetic merging with mergekit and evolutionary merges (pools of models), as well as quantization, LoRAs, etc. This is also a revolution! Or evolution!
We still have not seen much with the optimizers...
So we can say there are many small tasks which these small papers and mini experiments serve! But they are not the be-all of this business; in fact they are the children playing in the big-boy arena... Have you noticed the big names are all struggling with these models?
Hence the Geoffrey Hintons, the Microsoft teams, the IBMs and the Andrew Ngs of this world really have to stay silent in their roles as lecturers and just hope they can influence a few students to perform some of their dream tasks!
It is also relatively simple to replicate these papers! But it's important to implement the methods in a single model, or in multiple models, so that the models can be compared and measured against each other.
So you will need one evaluation metric to use on them all, as well as a model which has been trained on all the methods!
Otherwise all this talk is for nothing and it's just windblowing, bro!
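To make the "one metric for all of them" point concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: the names `exact_match`, `evaluate` and `compare`, the toy dataset, and the two stand-in "methods" (plain functions in place of real models). The only point is that every method gets scored on the same data with the same metric.

```python
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str]  # (prompt, expected answer)


def exact_match(prediction: str, expected: str) -> bool:
    """Case- and whitespace-insensitive exact match (a deliberately simple metric)."""
    return prediction.strip().lower() == expected.strip().lower()


def evaluate(predict: Callable[[str], str], dataset: List[Example]) -> float:
    """Return the fraction of examples the method answers correctly."""
    if not dataset:
        raise ValueError("dataset must not be empty")
    hits = sum(exact_match(predict(prompt), answer) for prompt, answer in dataset)
    return hits / len(dataset)


def compare(methods: Dict[str, Callable[[str], str]],
            dataset: List[Example]) -> Dict[str, float]:
    """Score every method on the same dataset with the same metric."""
    return {name: evaluate(fn, dataset) for name, fn in methods.items()}


# Toy stand-ins for real models (hypothetical, for illustration only).
dataset = [("2+2", "4"), ("capital of France", "Paris"), ("3*3", "9")]
lookup = {"2+2": "4", "capital of France": "paris", "3*3": "6"}


def method_a(prompt: str) -> str:
    return lookup.get(prompt, "")


def method_b(prompt: str) -> str:
    return "4"  # always gives the same answer


scores = compare({"method_a": method_a, "method_b": method_b}, dataset)
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

In a real comparison the stand-in functions would wrap actual model calls, and exact match would likely be swapped for a metric suited to the task.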
As you should know if you have followed the progress of my model,
you will note I implemented every single method in my simple Mistral model, the original 7B (instruct)!
And in fact I can ask it, offline, for telephone numbers of people I do not know, and it will give me them! Where did it learn that?
The problem with these methods is not the method... it's generalization of a task! Hence revisiting a task with various different prompts for the same datasets,
even reducing the prompt to nothing, or raising it to a complex prompt to promote more similarity in the embedding space between entities and clusters! It would seem as though people forgot it is a neural network, and not a person or all-knowing!
You have to train the model!
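A minimal sketch of what revisiting the same dataset with different prompts could look like when building training data. The templates, the `question`/`answer` field names and the `expand_with_prompts` helper are all assumptions made up for this example, not anything from a specific library or from my actual training setup.

```python
from typing import Dict, List

# Prompt variants, from "no prompt at all" up to a long, detailed instruction.
# These templates are illustrative assumptions only.
PROMPT_TEMPLATES = [
    "{question}",
    "Answer the question.\n{question}",
    (
        "You are a careful assistant. Read the question, reason step by step, "
        "then give a short final answer.\nQuestion: {question}"
    ),
]


def expand_with_prompts(records: List[Dict[str, str]],
                        templates: List[str] = PROMPT_TEMPLATES) -> List[Dict[str, str]]:
    """Pair every record with every prompt template, keeping the same target answer."""
    expanded = []
    for record in records:
        for template in templates:
            expanded.append({
                "prompt": template.format(question=record["question"]),
                "completion": record["answer"],
            })
    return expanded


records = [
    {"question": "What is the capital of France?", "answer": "Paris"},
    {"question": "What is 3 * 3?", "answer": "9"},
]
expanded = expand_with_prompts(records)
print(len(expanded))  # 2 records x 3 templates = 6 training examples
```

The target stays fixed while the wording around it changes, so the model sees the same task under several framings instead of memorising one prompt format.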
So please produce some models! Choose a base such as the basic Mistral/Llama (same model) (even Qwen and these others are exactly the same!)
I will produce a model just for you! Give me a bit of time; if I am going to do it, I want to do it right. I try to be super careful in this video and will remain careful moving forward. My specific criticism of their research paper is that the model literally does not work when I reconstruct their methods. I like where they are going with the math, which is why the paper caught my eye in the first place. But what good is mathematical and computational simplification if the end result does not work? That is backwards logic.
lol! ... There is some maths which is not interesting. I was also wondering what the expected outcome was going to be? This is the most important factor for me when reproducing an experiment: if the outcome is not something that I can incorporate into my model, it's useless!
Hence RNNs: I never tried them and skipped them as a tool! I personally stayed with the most basic neural network, the feedforward ANN!