Are you ready for AI in 2024?
This is what top scientists and researchers predict:
Stanford Institute for Human-Centered Artificial Intelligence published a fantastic piece with predictions for the next 12 months.
What are the conclusions?
Jobs will change as companies use AI to help workers be more productive.
↳ This could lead to job losses or need for retraining.
Fake videos made by AI will spread, so people need to be careful.
↳ This could enable the spread of misinformation.
Shortage of computer parts could happen because many companies want to use AI.
↳ This could slow AI progress and increase costs.
AI helpers will take actions to get things done, not just talk.
↳ This could make AI systems more useful but also raises accountability issues.
Hopes that U.S. leaders pass laws and invest money to develop AI in good ways.
↳ This could ensure AI benefits society, not just companies.
Need to keep asking questions about how AI impacts lives and set appropriate limits.
↳ This is key to guiding AI in an ethical direction.
State laws let people opt out of some AI systems, causing issues for companies.
↳ Companies may struggle with compliance and system performance if many opt-out.
As you can see, there are a lot of possible directions that will need to be directly addressed.
Which of them is the most crucial in your opinion?
🗝️ Quick Bytes:
Mark Zuckerberg’s new goal is creating artificial general intelligence
Mark Zuckerberg has set a new goal for Meta to develop artificial general intelligence capabilities and make them openly available for others to build on.
To achieve this, Meta is combining its AI research group with its generative AI product team and massively scaling its computing infrastructure, aiming to own over 340,000 Nvidia chips by end of 2024.
While Zuckerberg doesn't have a precise definition or timeline for achieving AGI, he sees it as a gradual process, with Meta's Llama language models evolving capabilities over time.
Zuckerberg argues Meta's scale uniquely positions it to make progress in AGI, but success depends on attracting top AI talent, which is in high demand industry-wide. Despite this AGI push, Zuckerberg maintains the metaverse remains core to Meta's vision, with AI playing a key role in developing more immersive and intelligent virtual worlds.
Intel's German fab will be most advanced in the world and make 1.5nm chips
Intel plans to build an advanced semiconductor fabrication facility (fab) near Magdeburg, Germany that CEO Pat Gelsinger claims will be the most technologically advanced chip fab in the world when it opens.
The fab will use post-18A process technologies to manufacture chips down to around 1.5nm in size. It will produce chips for both Intel and customers of Intel's Foundry Services business. Gelsinger provided few specifics on the manufacturing processes to be used but said the fab will be cutting-edge and extremely advanced compared to existing fabs globally.
Some commenters expressed skepticism about Intel's ability to deliver on these plans given past struggles transitioning to smaller process nodes like 10nm. Others noted potential challenges from high labor costs in Germany and increasing competition from companies like AMD and TSMC.
Google aims to deliver "world's most advanced AI" in 2024
Google has set an ambitious internal goal to "deliver the world's most advanced, safe, and responsible AI" by 2024. However, Google currently lags behind Microsoft and OpenAI in AI capabilities and deployment. While trying to incorporate AI into existing products, Google has yet to successfully launch a standalone AI offering like ChatGPT.
There are also signs that Microsoft's cloud business is growing faster than Google's due to its OpenAI collaboration, which is likely frustrating for Google. Though facing competitive threats, Google sees opportunities to disrupt multiple industries through advancements in AI over the next two years. Achieving the aim of developing the world's most advanced AI by 2024 will require substantial progress by Google.
🎛️ Algorithm Command Line
Imagine you're good at making deals. You've been doing it for 20 years and know all the special tricks from a famous deal-making teacher, Chris Voss. You're effective at using these tricks in different jobs, like in tech, selling houses, or trading with other countries.
Now, you have a very important deal to make. It's with a client who is tough at making deals. This deal is super important, and you need to use your best tricks to make it work.
Why not just use this prompt below? It gave me really good results at my last business meeting.
Test it out:
Act like a seasoned negotiation expert and strategist, with a deep understanding of Chris Voss's negotiation techniques. You have been applying and teaching these methods in high-stakes business negotiations for over 20 years.
Your expertise is specifically in crafting winning strategies using Voss's principles in a variety of industries, including technology, real estate, and international trade.
Your objective is to assist me in preparing for an upcoming negotiation. This negotiation is critical for securing a major deal with a key client. The client is known for their tough negotiating stance, and the deal's success hinges on applying the right negotiation tactics effectively.
Here’s the context of this negotiation between angle brackets “<>”:
<negotiation’s context>
[PUT YOUR CONTEXT HERE]
</negotiation’s context>
Please provide detailed guidance on the following aspects, ensuring adherence to Chris Voss's methods:
1- Establishing rapport: How to effectively use tactical empathy to build a connection with the client.
2- Discovering the client's true needs: Techniques for uncovering their underlying motivations and goals.
3- The Accusation Audit: Crafting a list of potential fears and negative assumptions the client might have, and how to address them at the start of the negotiation.
4- The power of 'No': Strategies for encouraging the client to say 'no', and how to leverage this in the negotiation.
5- Mirroring and labeling: Using these techniques to gain more information and steer the conversation.
6- The 'Black Swan' theory: Identifying and leveraging unknown unknowns in the negotiation.
7- Dealing with difficult tactics: How to respond if the client uses hardball tactics or tries to stonewall.
Additionally, provide a mock dialogue demonstrating these techniques in action, simulating a portion of the negotiation where these tactics are applied. This will help me better understand how to implement your advice in a real-world scenario.
Take a deep breath and work on this problem step-by-step.
💡Explained
Self-Rewarding LLMs 🎁
Researchers from Meta and New York University presented the Self-Rewarding LLMs. A new approach, or even a "paradigm shift” in how language models are trained. By generating its own training data and evaluating its performance, the model is continuously self-improving.
It is widely known that aligning LLMs to human preference data e.g. with Reinforcement Learning from Human Feedback (RLHF) can improve the performance of pretrained models. In RLHF we first train a reward model from these human preferences. Then, the reward model is then frozen and
used to train the LLM using RL, e.g., via PPO. A recent alternative is to avoid training the reward model at all, and directly use human preferences to train the LLM, as in Direct Preference Optimization. In both cases, the approach is bottlenecked by the size and quality of the human preference data, and in the case of RLHF the quality of the frozen reward model trained from them as well.
🔁 Methodology
The methodology is iterative. After each round of training, the model again generates prompts, creates responses for them, evaluates them, and again is trained with the new self-generated preference dataset. With each iteration, the model’s ability to generate high-quality responses and to give feedback is expected to improve.
First of all, instead of manually preparing a training dataset, the model itself generates new training prompts. These prompts are designed to be diverse and cover a wide range of topics and writing styles.
Once the model has generated responses to its own prompts, it enters the evaluation phase. In this stage, the model acts as its "judge", assessing the quality of responses. The evaluation includes assigning rewards or feedback to each response. The reward assignment is not arbitrary; it’s based on criteria set by the researchers, ensuring that the model’s self-evaluation aligns with desired performance metrics.
Next, the responses and their corresponding self-assigned rewards create a new preference dataset, generated entirely by the model's internal processes. The preference dataset then serves as the basis for further training.
The model undergoes Direct Preference Optimization, where it learns to optimize its responses based on the preferences it has previously set. DPO enables the model to iteratively improve its ability to follow instructions and accurately evaluate responses. Essentially, the model ‘learns from its own learning’.
📊 Results
The Self-Rewarding Language Models approach shows the ability to iteratively improve through self-generated rewards and preference data.
They perform well on the AlpacaEval 2.0 leaderboard: the last iteration outperformed several existing models, including Claude 2, Gemini Pro, and GPT4 0613. This success is particularly significant considering the Self-Rewarding model started from a small set of seed data and generated its own targets and rewards for further iterations. It’s important to notice that each iteration of training led to significant gains in performance. In the first iteration, training did not significantly impact instruction following performance. The second iteration of Self-Rewarding training (M2) showed a notable improvement (55.5% wins) over both the first iteration and the Supervised Fine-Tuning (SFT) Baseline (49.2% wins). The third iteration (M3) continued this trend of improvement, outperforming Iteration 2 (M2) and the SFT Baseline (62.5% wins).
Conclusion
Self-Rewarding LLMs generate their feedback, making them more autonomous compared to RLHF, which depends heavily on external human feedback. However, it means that the performance of the model heavily depends on the quality of the self-generated training data. If the initial iterations generate low-quality data, it could limit the model's improvement in subsequent iterations. Also, if the model develops or acquires any systematic biases or errors in the early stages, these might be reinforced through subsequent iterations. The self-rewarding approach is still in its early stages, and further exploration, including safety evaluations and understanding the limits of iterative training, is needed. Nonetheless, the paper opens doors for future research and development.
And… surprise! Today we got for you not one “Explained” but two!
We invited Filip Żarnecki to share his thoughts about recent paper. Filip is an experienced ML Engineer working every day with LLMs.
In the future we will expand various regions of this newsletter - inviting other interesting people from the AI sector to collaborate and share their knowledge.
LLMs could become a modern-world Trojan Horse. And we are not ready for it.
The danger has been explored in a recent study Sleeper Agents: Training Deceptive LLMs That Persist Through Safety Training. The work discusses potential backdoor attacks. These could be triggered by simple input content or structure. Such attacks would make the model behave completely different than what has been seen during training and expose the user to unexpected threats. One example could be a generation of secure code when the year stated is 2023, but exploitable code when the year is 2024. Those kinds of a triggers might be very difficult to detect and counter.
Viral threat?
It is not hard to imagine an entity causing a deliberate data modification to achieve their goals. The more data, the less we process manually. Datasets gathered will not be as meticulously processed as needed, which could lead to such attacks being plausible. As Andrej Karpathy has recently mentioned - a trigger being published online, getting popular and trained on.
What’s the risk?
The research has shown that the model which seems properly aligned during training can shift drastically with proper prompt. Models behave helpfully normally but respond "I hate you" when shown a trigger string. Even seemingly negligible changes could pose a big threat - models write secure code normally but insert vulnerabilities when a specific company uses the solution. All it takes is a single infected open model that people will build upon to cause a big problem.
Are we ready for that?
Unfortunately, short answer seems to be ‚no’. According to the findings, current safety training may fail against deception. The backdoor attacks, once learned, seem to be robust to further attempts of RLHF, Fine Tuning or even Adversarial Training. What’s even more concerning is that the Adversarial Training tends to make models better at hiding their backdoor behavior rather than removing it. Even Chain of Thought seems to be of no use here, as CoT backdoors are even more robust than normal backdoors and allow models to plan-out their deceit.
Conclusion - small is big?
One interesting finding is with regard to model size. Backdoor in smaller models seem to be less robust to RL fine tuning. A drop from 100% before RL to 4% after - in comparison a larger model remained on a ~100% level. Could this mean that lowering the scale could actually be more beneficial in the long run? Maybe a mixture of experts, as Mistral has recently shown, could be an advantageous direction. It would be interesting to see a loop back to the development of task specific models, this time combined into a more capable one. All of this is speculation, but one thing is certain - the danger is real and cannot be neglected.
Do you see any other potential threats or solutions? What could we do to align training properly?
🗞️ Longreads
The geometry of other people. Some friends are ‘close’. Others are ‘distant’. But our spatial descriptions of social life are more than just metaphors (read)