At PointClickCare our mission is simple: to help providers deliver exceptional care. And that starts with our people. As a leading health tech company that’s founder-led and privately held, we empower our employees to push boundaries, innovate, and shape the future of healthcare.

With the largest long-term and post-acute care dataset and a Marketplace of 400+ integrated partners, our platform serves over 30,000 provider organizations, making a real difference in millions of lives. We also reinvest a significant percentage of our revenue in research and development, ensuring our employees have the resources to innovate and make a lasting impact. Recognized by Forbes as a top private cloud company and honored as one of Canada’s Most Admired Corporate Cultures, we offer flexibility, growth opportunities, and meaningful work.

At PointClickCare, we empower our people to be the architects of a smarter healthcare future: one that is human-first and accelerated by AI to create meaningful and lasting change. Employees harness AI as a catalyst for creativity, productivity, and thoughtful decision-making. By integrating AI tools into our daily workflows, we enhance collaboration, improve outcomes, and give every team member the proficiency to maximize their impact. It all starts with our hiring practices, where we uncover AI expertise that complements our mission, and we continue to invest in training and development to nurture innovation throughout the employee journey.

Join us in redefining healthcare — so it doesn’t just survive, it thrives. To learn more about PointClickCare, check out Life at PointClickCare and connect with us on Glassdoor and LinkedIn.

**Travel to Office expectations**

For Remote Roles: If this role is remote, there will be in-office events that require travel to and from the Mississauga and/or Salt Lake City office. These will include, but are not limited to, onboarding, team events, and semi-annual and annual team meetings.

For Hybrid Roles: If this role is hybrid, you will be expected to reside within commutable distance of the office/location specified in the job listing. In-office expectations will include, but are not limited to, weekly/bi-weekly/monthly events in the office with your specific team. This is a requirement for this role.

About the Role

PointClickCare’s Advanced Technology / AI Applied Research team designs, builds, tunes, evaluates, and delivers AI model systems on clinical and operational data to help providers deliver excellent care. The Senior Applied Research Engineer ensures we have data in the right shape to develop and deliver that AI safely, effectively, and significantly more efficiently than today.

You will build and own the gold data layer that sits between our silver Lakehouse data and the AI work that depends on it--building it, validating it, documenting it, and extending it as products evolve and new AI needs come into scope. What you build will be a highly leveraged asset, relied on by multiple AI model system creators across the full R&D lifecycle: EDA, experiments, model development, evaluation, and operational sustaining--supporting AI work ranging from classical ML to the latest generative and agentic approaches.

This role blends data engineering with applied AI data science. You will sit with AI researchers to understand what they need, and work with data platform, product, clinical, and workflow experts to understand the data, where it comes from, its transformation from raw to silver, and what it means. This is the first hire in a function expected to grow over time, embedded directly in PCC’s team of AI model development experts.

What You’ll Do

Own the gold data layer. Transform messy silver tables into curated, semantically rich, clean, and documented gold datasets suitable for AI model development, including datasets and features reusable for AI development across projects. Maintain the data as products and needs evolve. To do this you will:

Reverse-engineer data semantics. Talk with product engineers and with clinical and workflow experts to learn how the products are used and how data are created in the field. Read SQL queries, stored procedures, technical data definitions, and other code to learn how products represent and transform data. Learn how data are ingested into the data lake, what silver tables and columns actually represent, and how they behave. Capture provenance, semantics, clinical event sequencing, cross-module record linkage, and known quirks.

Bridge semantics with AI needs. Understand researchers' data needs, then design and build the gold data product and its evolving documentation to meet them, giving model R&D a highly efficient, AI-first foundation.

Curate datasets across modalities. For various AI uses such as generative AI, RAG, predictive, and other techniques, support researcher needs for chunked and tagged unstructured content with rich metadata, point-in-time-correct features, and clean labels. For classical ML and statistical work, deliver model-ready tables.

Build pipelines for reuse. Develop transformations from silver into gold inside Databricks/Spark as scheduled, observable workloads. Design them so researchers can iterate on new features and data mixes without rebuilding from scratch.

Automate quality, filtering, and synthesis. Support research needs for programmatic labeling, weak supervision, near-duplicate detection, boilerplate and noise removal, and LLM-API-driven synthetic data generation where ground truth is scarce.

Version and hand off. Maintain reproducible dataset snapshots. Define clean lineage and semantic definitions so the downstream team can use and re-use gold datasets in AI R&D.

Required Skills and Experience

5+ years building production data systems, with at least 2 supporting ML or AI workloads.

Track record of learning complex new data domains quickly, through reading source code, interviewing experts, and building durable artifacts others rely on.

Advanced Python, SQL, and PySpark/Databricks for working with large, messy data. Expert SQL specifically: comfortable reading complex stored procedures and reverse-engineering business logic from queries.

Databricks ecosystem depth: Delta Lake, Unity Catalog, Spark/PySpark tuning, MLflow.

AI domain literacy: working understanding of embeddings, tokenization, feature engineering, point-in-time correctness, train/validation/test splits, data drift, and the differences between what classical ML and generative models need from data.

Data wrangling across modalities: transforming unstructured content (text, PDFs, transcripts, logs) and structured tabular data into clean, model-ready forms.

AI-friendly data formats (Parquet, Hugging Face datasets) and storage layout decisions (partitioning, sharding, caching) that keep researcher workflows responsive in Azure, AWS, or other working environments.

Data quality, filtering, and synthesis pipelines: support for programmatic labeling and weak supervision (e.g. Snorkel or equivalent), near-duplicate detection (MinHash/LSH), content and quality filters, LLM-API-driven synthetic data generation.

Pipeline orchestration (e.g. Airflow, Databricks Workflows, Dagster, or Prefect) and dataset versioning, including Unity Catalog and feature-store support.

Experience handling regulated or sensitive data under controlled access (HIPAA or equivalent). Familiarity with general de-identification concepts.

Git-based version control and CI/CD for data and code.

Strong written documentation. Skill in eliciting requirements and tacit knowledge from technical and non-technical experts.

Bachelor’s degree in computer science, data science, engineering, statistics, or a related field. Equivalent practical experience considered.

Preferred

Hands-on EHR data experience, ideally in skilled nursing, long-term care, post-acute care, or senior living.

Working knowledge of clinical terminologies (ICD-10, SNOMED CT, LOINC) and data standards (HL7v2, FHIR, CCDA).

dbt for transformation and testing.

Familiarity with training-side ML frameworks (e.g. PyTorch) sufficient to debug data-side bottlenecks; experience supporting LLM or foundation-model training or fine-tuning data pipelines.

Clinical NLP, OCR, document parsing, or ASR / transcript pipeline experience.

Data lineage and catalog tools.

Prior experience embedded inside an AI or ML research team.

Master’s degree in a relevant quantitative or computer science field.

What Success Looks Like

AI researchers can start new projects without spending the opening weeks reconstructing what PointClickCare entities mean or rebuilding the same transformations. The gold datasets they need exist, are versioned, are documented, and accelerate work across EDA, experiments, model development, and evaluation. As coverage expands across data types, modalities, and product surfaces, the function grows with it.

PointClickCare Benefits & Perks:

Benefits starting from Day 1!

Retirement Plan Matching

Flexible Paid Time Off

Wellness Support Programs and Resources

Parental & Caregiver Leaves

Fertility & Adoption Support

Continuous Development Support Program

Employee Assistance Program

Allyship and Inclusion Communities

Employee Recognition … and more!

It is the policy of PointClickCare to ensure equal employment opportunity without discrimination or harassment on the basis of race, religion, national origin, status, age, sex, sexual orientation, gender identity or expression, marital or domestic/civil partnership status, disability, veteran status, genetic information, or any other basis protected by law. PointClickCare welcomes and encourages applications from people with disabilities. Accommodations are available upon request for candidates taking part in all aspects of the selection process. Please contact recruitment@pointclickcare.com should you require any accommodations. As part of our commitment to a streamlined and equitable hiring experience, PointClickCare uses AI tools to assist with candidate screening and assessment.

When you apply for a position, your information is processed and stored with Lever, in accordance with Lever’s Privacy Policy. We use this information to evaluate your candidacy for the posted position. We also store this information, and may use it in relation to future positions to which you apply, or which we believe may be relevant to you given your background. When we have no ongoing legitimate business need to process your information, we will either delete or anonymize it. If you have any questions about how PointClickCare uses or processes your information, or if you would like to ask to access, correct, or delete your information, please contact PointClickCare’s human resources team: recruitment@pointclickcare.com

PointClickCare is committed to Information Security. By applying to this position, if hired, you commit to following our information security policies and procedures and making every effort to secure confidential and/or sensitive information.

Senior Research Data Engineer
