Published: March 5, 2023
midpage is building a GitHub Copilot for lawyers. We train LLMs on millions of (US) legal documents. This is a hardcore NLP problem.
To do this right, we'll have to find ways to avoid hallucinations (via retrieval-augmented generation) and make predictions explainable (by showing hyperlinks and extractive summaries). Our team is from ETH, Oxford, Amazon, and Freshfields.
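To make the retrieval idea above concrete, here is a minimal sketch, not our actual pipeline. It assumes a toy word-overlap scorer in place of a real retriever, and the names `Passage`, `retrieve`, `build_prompt`, and the example.com URLs are hypothetical. The point is only that the model is asked to answer from retrieved passages, each carrying a link the user can check.

```python
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Passage:
    url: str   # hypothetical hyperlink to the source document
    text: str  # extractive snippet shown to the user


def tokenize(text: str) -> list[str]:
    # Lowercase and split into alphanumeric tokens.
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query: str, passage: Passage) -> int:
    # Toy relevance score: number of query tokens that also occur in the passage.
    q = Counter(tokenize(query))
    p = Counter(tokenize(passage.text))
    return sum(min(q[t], p[t]) for t in q)


def retrieve(query: str, corpus: list[Passage], k: int = 2) -> list[Passage]:
    # Keep the k best-scoring passages and drop those with no overlap at all.
    ranked = sorted(corpus, key=lambda p: score(query, p), reverse=True)
    return [p for p in ranked[:k] if score(query, p) > 0]


def build_prompt(query: str, passages: list[Passage]) -> str:
    # Number each source so the generated answer can cite it.
    sources = "\n".join(
        f"[{i}] {p.text} ({p.url})" for i, p in enumerate(passages, 1)
    )
    return (
        "Answer using only the sources below and cite them by number.\n"
        f"{sources}\n\nQuestion: {query}\nAnswer:"
    )


# Hypothetical three-document corpus for illustration only.
corpus = [
    Passage("https://example.com/a", "A contract requires offer, acceptance and consideration."),
    Passage("https://example.com/b", "Negligence requires duty, breach, causation and damages."),
    Passage("https://example.com/c", "The statute of limitations bars stale claims."),
]

hits = retrieve("What does negligence require?", corpus)
print(build_prompt("What does negligence require?", hits))
```

In a real system the overlap scorer would likely be replaced by a learned retriever, and the prompt would be passed to the language model; the numbered sources are what make the hyperlinks and extractive summaries possible.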
Lawyers write a lot of text. Much of that is copy-pasting or piecing together arguments from past cases. They spend a lot of time researching and reformulating details that somebody else has already figured out.
GitHub Copilot was a huge breakthrough for programmers. But because lawyers are even more constrained by how much knowledge they can hold in memory, legal autocompletion will have an even bigger impact.
- Training and evaluating large language models
- Transferring methods from AI research
- Improving training and inference efficiency
- Data engineering
We'll start with the open-source 20B-parameter GPT-NeoX, but will go back and forth between it and much smaller models for validation. You'll be part of a world-class team of 5-15 people building pipelines for training and inference. We are going to have technical mentors with experience in large-scale models to help out.
Add the phrase "autocompletion is all you need" at the beginning of your cover letter/message. Apart from that, a cover letter is not required. We have hundreds of applicants per week, and this helps with filtering.
- Strong coding background
- Strong experience in DL research/deployment
A plus but not necessary: PhD in NLP/DL
- Early team members can get up to 100k worth of equity.
- Be part of a fast-growing team of extraordinarily talented developers/researchers.
- Chance to stay up to date with AI research
- Perks like a badass office in Berlin Mitte, free snacks/fruit/coffee and team lunches (3 per week)
This task is huge. You have to be ambitious to want to take it on.
We care about the people in our small team being ambitious and fun to hang out with.
To get a glimpse of our worldview, here are some speculations about things that we think will happen in the next 2 years:
- The first real Google Search competitor will appear, one in which most queries are answered with generated text.
- Autocomplete features will come to GUIs, e.g. a Chrome extension that highlights where you should click next and prefills input fields.
Near-human-level AI is on the way. We want to position ourselves to have lots of users and data before that happens.