The future of AI hinges on access to vast and diverse datasets, many of which are locked away in tough-to-parse formats like PDFs. We built SoTA document intelligence models to solve this problem, including OCR and PDF parsing repos, and our tools Surya and Marker have accumulated over 40K stars collectively. We do meaningful research, ship product, and contribute to open source.
What started as an open-source passion project has now been adopted by hundreds of your favorite teams and researchers at leading organizations like Ai2, OpenAI, Harvard, Stanford, and MIT’s research labs. We also recently raised a seed round from founding members of OpenAI, FAIR, and Hugging Face to continue building our API and enterprise products (announcing soon!).
Salary range: $250k - $350k | Equity: 1% - 3% | In-person Brooklyn, NY
There are a lot of exciting things to build, and we’re looking for our first Fullstack Engineer to own and help us scale our Enterprise functionality.
Day to day, you will:
Ship features to our open source repos and API
Optimize inference performance
Design and ship frontend features
Interact with and foster our growing community by collecting feedback and resolving issues (we interact with them through Discord and on GitHub)
We’re looking for someone with a founder’s mentality to help us build Datalab. If you enjoy ownership, moving quickly, and shipping frequently, and you take pride in seeing your work through, we’d love to hear from you!
We’re eager to work with someone who has:
5+ years of fullstack development experience building APIs and/or developer-focused products
Experience building in early-stage or hyper-growth startup environments
Experience building user-centric applications and frontend infrastructures
Shipped a large-scale product to production, supporting tens of thousands of active users
Proven experience in engaging with customers for feedback and iteration
Bonus points if you:
Demonstrate strong technical skills through notable projects, ideally open source
Have experience with document processing
Are interested in AI and training models (prior experience not necessary)
We believe in the simplest possible technology that gets the job done. Our stack is FastAPI (Python) for the backend and frontend, with some light HTMX and JS sprinkled in. We use Postgres and Redis, and deploy to Render.
We have a BYOD (bring-your-own-device) policy, but are also happy to offer a company-sponsored laptop. You’re also welcome to choose the operating system that works best for you!
We're a small, seed-stage team working out of an Industrious coworking space in Prospect Heights, Brooklyn (where they serve daily breakfast, snacks, and host happy hours!). While we value in-person collaboration, we're flexible with work-from-home days. We aim to maintain a tight-knit team, which helps us stay nimble and move quickly in this fast-paced industry. We love getting into flow state together and bonding over occasional team lunches and out-of-work hangouts.
A 30-minute video call to evaluate fit
A paid take-home project (~10 hours, $1000)
A 1-hour follow-up meeting to discuss the project
If this role is interesting, please email [email protected] - include a link to something you’ve built if possible.