Celebrating women and girls in science: An interview with Dr. Sevgui Erman

By: Ainsley Sullivan and Nashveen Mendes, Statistics Canada; Sevgui Erman, National Research Council of Canada

Introduction

In 2030, the United Nations is set to deliver on its 17 Sustainable Development Goals, one of which is gender equality. February 11 marks the International Day of Women and Girls in Science (IDWGIS), which promotes gender equality in science, technology, engineering and mathematics and the elimination of gender stereotypes. To mark this day, we interviewed Dr. Sevgui Erman, a prominent woman in data science whose career spans many sectors across data science, digital technologies and analytics.

Dr. Sevgui Erman joined the National Research Council of Canada (NRC) in early 2022, and leads research programs and services in computer vision, natural language processing, advanced analytics, quantum computing and more. Her focus is to advance research and innovation in Canada, through accelerating discovery, enabling modelling and digital twins, multilingual text analytics, security and privacy. Before joining the NRC, Dr. Erman was Chief Data Scientist and Senior Director of the Data Science Division at Statistics Canada. She led the agency's Data Science Strategy and launched the Data Science Network for the Federal Public Service, which, to date, includes more than 3,000 members. Dr. Erman holds a PhD from the University of Paris-Sud in Signal Processing and System Control, two areas that have strong links to artificial intelligence (AI).

Dr. Erman, how did you become interested in data science and what sparked your passion for this field?

Data science was present early on in my career and studies. I was often interested in optimizing processes and products through data and technology. My PhD, in particular, was about robust engineering design; I used modelling to identify design parameters that optimized a product's performance by minimizing the worst possible performance degradation due to environmental, manufacturing or other types of factors. In principle, we apply a similar approach today to new materials discoveries using AI, by efficiently searching the parameter space to find a new optimal design.

In 2005, I was awarded a patent in optical telecommunications. I'm particularly fond of this patent because the algorithm provided a substantial system performance improvement, increasing the reach of the signal transmission. It processed big data in real time and was implemented on more than one Nortel Optical Metro platform at the time. This event sparked my passion.

At StatCan, I recall the challenges handling scanner data. In 2017, StatCan was dealing with tens of millions of records to be processed weekly. We were able to solve this issue by using AI. I also recall the excitement at the C2 conference in Montreal in June 2017, where I felt the vibrant AI ecosystem being formed in Canada. Data science was emerging as a powerful tool. Recently, a paper I co-authored was published in the Harvard Data Science Review and describes how data science enables innovation and supports the development of relevant and trusted statistical products.

I continue to be passionate about the work in this space. Data science continues to be a highly dynamic and constantly evolving space. The recent development of OpenAI's ChatGPT demonstrates the growth of this sector and what a fascinating time it is to interact with these powerful technologies and contribute to real-world solutions.

What, in your opinion, does the Government of Canada do well with data science? What are departments' main roadblocks in seeking to advance their abilities?

I believe the Government of Canada does well at advancing science with purpose. Throughout the pandemic, data science methods supported the delivery of high-quality and timely statistics. Collectively, our data science work focused on solving concrete problems and delivering practical results. At StatCan, data science is used to support the creation of new products and enable effective service delivery through automation. For example, extracting information from PDFs and other documentation can be a very time-consuming process, and StatCan is applying data science techniques to automate this information extraction process. Another example is the use of data science in construction and agriculture to detect the start of building construction or to identify greenhouse areas from satellite or aerial imagery. The long-term goal in these cases is to replace, in part, an existing survey and reduce response burden.

At the NRC, data science is used to contribute to new knowledge and intellectual property, economic growth and a healthier future for Canada. For instance, novel methods are used in bioinformatics for the discovery of drugs that inhibit cancer growth, or for developing pre-symptomatic diagnostics for age-related diseases such as dementia. Computer vision is used for automatic ergonomic and fatigue risk monitoring in advanced manufacturing, as well as real-time imaging as part of 3D printing in volumetric additive manufacturing.

In terms of roadblocks, more needs to be done to access talent. Due to the expansion of AI across industries, access to data scientists and data engineers continues to be limited.

Not a roadblock, but a challenge, is the need to scale up outputs faster. In this context, collaboration is a must: we can't do this alone, and we must work with other innovators, academia, businesses and international partners. At the NRC, for instance, this is done through pan-Canadian networks around Challenge programs and through collaboration centres at Canadian universities. We work together to find creative, relevant and sustainable solutions to Canadian challenges.

Computing infrastructure will also continue to be a strong focus in order for Canada to do well. Organizations continue to make progress on this front, such as developing cloud capabilities and on-site clusters. We need infrastructure that is scalable and can support multi-party collaborations between organizations.

Image of Sevgui Erman

Text in image: "…get out of your comfort zone. This is important because it pushes you to learn and, most importantly, it augments the contributions you can make."

Sevgui Erman, Executive Director, Digital Technologies Research Centre at the National Research Council of Canada

Looking to the future, what do you see as potential opportunities or challenges in the field of data science for Canada?

These are exciting times, and there are opportunities to create a better future for humanity. For instance, for ocean health preservation, we can use AI for pathogen contamination modelling or machine vision for the purposes of whale protection and tracking.

We have an opportunity for Canada to leverage diversity and provide organizations with unique perspectives and creative solutions. Data science teams are an excellent example of heterogeneous groups due to the multidisciplinary nature of the work. I have seen it at StatCan, and I am seeing it also at the NRC. For instance, in the Digital Technologies Research Centre, we have researchers with physics backgrounds applying signal processing methods to research in systems biology for the design of new molecules and drugs. Our Indigenous languages speech generation work brings together a diverse team that includes researchers from Indigenous communities. Their results indicate, firsthand, how diversity contributes to innovation. I believe this is a winning approach when building teams – to build diverse teams with various professional backgrounds, various races or cultures and genders, and individuals with disabilities that can come up with unique answers to challenging problems.

Ethical AI must be at the centre of our work – to deliver solutions that do not cause discrimination but reduce bias. It's important for data scientists to review results from different angles to identify potential harm and ensure there are benefits to Canadians. Transparency and accountability go hand in hand. In Canada, substantial progress has been made via the Treasury Board Secretariat's Directive on Automated Decision-Making. StatCan's quality and machine learning guidelines are an excellent tool for the data practitioner. The recently tabled Bill C-27, the Digital Charter Implementation Act, is advancing Canada's commitment to responsible AI. At the NRC, we're developing a program to further advance transparency and explainability methods, as well as research on privacy-preserving technologies that enable work on more sensitive data sources.

What advice would you impart to those interested in pursuing a career in data science?

Keep the human factor at the core of everything you do: your team, your colleagues, your partners. This is what matters the most, because the ecosystem you operate in determines the impact you can create. Teamwork is essential. I'm grateful for the opportunities I've had, and continue to have, to work with talented, smart and incredible crews and colleagues who are committed to advancing science and creating a positive impact for society.

I also believe that direct and honest conversations are a key ingredient for our success. There isn't anything more precious than an open discussion, very early on in any endeavor, about the potential challenges and risks, followed by joint work to come up with meaningful solutions that leverage previous experiences and new ideas.

My advice to women in particular is that despite data science being a male-dominated field, women should still pursue it. Their diverse perspectives and life experiences contribute to a richer environment, generating new ideas and enabling creativity. I believe that we need to raise awareness about the opportunities for women and encourage girls to pursue careers and interests in science, technology, engineering, and mathematics.

Over the years, what valuable lesson stands out for you that you'd be able to share with our readers - be it personal, educational, or professional?

I would like to share two lessons. The first is to get out of your comfort zone. This is important because it pushes you to learn and, most importantly, it augments the contributions you can make. One can start using a new tool, work in a new subject matter or use a new algorithm. These are all growth opportunities and an essential ingredient of working in a fast-paced environment. I have worked in the private and public sectors and in academia, across telecommunications, statistical production and IT. At each step along the way, I was open to learning different things and able to contribute effectively.

Each of these experiences contributed to shaping me as a professional and developing my leadership style. They allowed me to listen with empathy, be inclusive, and look for consensus-based solutions.

The second lesson – be a self-starter, take risks and navigate them. In fact, this is what many researchers are encouraged to do early in their careers while dealing with new technologies and working to advance science. A book I would recommend, which inspired me when the Data Science Accelerator was created in 2017, is The Lean Startup by Eric Ries. It provides practical advice for rapid scientific experimentation embedded in the product development lifecycle, and it is built on the premise that one can create a "startup" in any organization or environment.

Do you have inspirational words for budding data scientists?

I think it is important to enjoy the small things that make us happy. It is up to us to make space for these moments in our lives. Three years ago, I was diagnosed with breast cancer that was detected early. I benefited from excellent medical support and had fantastic support from colleagues. I also had my loving family by my side, so I was grateful for such a positive experience. The cancer provided me with a new perception of the world and allowed me, in fact, to focus on the things that matter the most. I stopped worrying about whether I was balancing well on all fronts. Instead, it allowed me to create a clear picture of what I wanted to make time for. My advice is to have the courage to not fit any specific mold and to be "off-balance." I also recommend the book Off Balance by Matthew Kelly.

Here are my parting words for young data scientists: What is your passion? What powers you and what, exactly, makes you dream? I encourage you to dream big as your passion is your fuel. You are a part of our future. You bring new energy and your work will enable the delivery of projects that will advance science and innovation. So, give yourself permission to hustle and be "off-balance."

Conclusion

Since March 2022, Dr. Erman has led NRC research programs in digital technologies, with a focus on accelerating scientific discovery and innovation in health, advanced manufacturing, the blue economy and quantum. Dr. Erman is excited to continue to collaborate with her colleagues as they use data science to create meaningful impact for Canada and the world.

Subscribe to the Data Science Network for the Federal Public Service newsletter to keep up with the latest data science news.
