Blue Squad
Austin, TX
Apr 26, 2021
Full time
About the company
Blue Squad was founded out of a desire to build a more connected community of progressive organizers, activists, and the constituents they seek to help. We're passionate about building technology that can help everyone find their inner activist. During the 2020 cycle, we worked with over 40 different campaigns and advocacy groups to help them scale their relational organizing programs and reach hundreds of thousands of voters. We're currently developing a new platform that will take that reach even further. Learn more about our work by exploring this website or checking us out on Twitter and Instagram.
About the role
We’re hiring a Senior Data & Software Engineer to help manage our data ecosystem. This role relies heavily on software engineering and microservices to perform ETL of external data as well as advanced postprocessing of internal data. Such data includes:
Third-party data on candidates running for office
Information on currently elected officials
Voter records for millions of people
User-generated data related to their and their friends' civic engagement
In this role, you will be responsible for the microservices that manage the “plumbing” that connects this data in meaningful ways for our users - for example, allowing a user to know who represents them across all levels of government and who they and their friends can vote for in upcoming elections. Detailed familiarity with such aspects of the US government, voting, and elections is not required, but a strong desire to learn will be critical to success in this role. Future projects may also include opportunities for applying advanced data science techniques like machine learning.
Responsibilities:
Specific responsibilities can be broken out into three groups:
Existing systems:
Maintain and improve upon the existing codebases that process batch data updates from third-party sources. These datasets come from varied sources (primarily CSVs and APIs), at varied frequencies (ranging from daily during elections to sporadically as elected officials change), and range from thousands of records at a time to hundreds of millions of records.
Maintain and improve upon existing two-way integrations with third-party APIs.
New systems to support upcoming new features:
Overhaul our management of users' contact data to provide users with more flexibility and control. Our users' data privacy is a cornerstone of our company ethos, and this must regularly be taken into account when developing systems to manage it.
Implement backend data needs for new action types such as surveys. Backend data needs include designing NoSQL data models and creating new code to postprocess this data as needed.
Implement backend data needs to support user posts and related activity such as commenting and sharing.
Future product growth:
Explore and develop additional data integrations that will be beneficial to our users, such as data on donations to candidates and data on elected officials' votes on legislation.
Develop systems to support content recommendation based on each user’s interests.
Contribute ideas for new features and advanced, data-driven innovations!
Additional general responsibilities include:
Ensure and communicate a thorough understanding of impacts to our wider technology ecosystem when changes are made to the data codebases.
Write and maintain documentation of our data architecture.
Draft QA checklists for other team members to review the impact of codebase updates.
Troubleshoot and implement fixes for any bugs that may arise.
Recommended skills:
We expect anyone in this role to be very familiar (2+ years of experience, or use in multiple projects) with the following technologies and skillsets:
Python
Data exploration, cleaning, and ETL
Due to the volume of data we process and its frequent JSON-like structure, we rarely rely on table-based libraries such as numpy and pandas for the kind of data processing required in this role. Fluency with built-in data structures, especially dictionaries, and with streaming approaches to data processing is a must.
Similarly, our codebase does not rely on notebooks such as Jupyter. It's fine to use notebooks for exploration and development purposes as needed, but you must be able to efficiently develop and maintain code in .py files.
MongoDB
This is our primary database, so you will need deep familiarity with the database technology itself, as well as the PyMongo driver.
You will also be expected to design data models for new Mongo collections.
SQL (Postgres)
Basic knowledge of SQL is all that is needed for this role.
Using APIs via Python
Primarily third-party APIs, but some use of Blue Squad’s internal API may occur as well.
Familiarity with the following technologies will be helpful but is not a requirement:
Elasticsearch - prior experience is not required to enter the role, but you will be required to learn Elasticsearch, which has a steep learning curve.
Docker
AWS-based deployment
RabbitMQ
Node.js
What this role offers:
At Blue Squad, we are committed to building a passionate, experienced team that is driven by our mission. As such, we work hard to offer a supportive work environment where team members feel a shared purpose, bond with one another, and are compensated competitively. Our compensation package for this role includes:
Annual salary between $100,000 and $140,000
Equity in the company
Medical, dental, and vision insurance coverage
Flexible vacation policy
In addition to the above package, we try to support team members by:
Allowing them to work from home as needed
Giving them flexibility around their work schedule
Providing opportunities for growth through project ownership