Faculty
H. Andrew Schwartz is director of the Human Language Analysis Beings (HLAB) housed in the Computer Science Department at Stony Brook University (SUNY). His interdisciplinary research focuses on human-centered natural language processing for the health and social sciences. Andrew is also a PI/co-founder for the World Well-Being Project, a multi-disciplinary consortium between the University of Pennsylvania, Stony Brook University, and Stanford University focused on developing large-scale language analyses that reveal and predict differences in health, personality, and well-being. Andrew is an active member in the fields of AI-natural language processing, psychology, and health informatics. He is the creator and one of the maintainers of the Differential Language Analysis ToolKit (DLATK), used in over 100 studies and by a variety of tech organizations. He received his Ph.D. in Computer Science from the University of Central Florida in 2011 with research on acquiring common sense knowledge from the Web. He is a 2020 Recipient of a DARPA Young Faculty Award. His research often attracts public interest with articles in, e.g., The New York Times, USA Today, and The Washington Post. He enjoys his family, research, and disc golf.
PhD Students
Vasudha is a fourth-year Ph.D. student working in the areas of natural language processing and the relationships between cognition and language, especially in the context of social media. She is particularly interested in understanding the cognitive style of users through their language usage. She is also interested in interpretability and explainability of language models, multilingual models for NLP and generalizable language modeling.
Adi is a computer science PhD student working with Prof. Schwartz on NLP problems with a human focus. His recent and ongoing projects inform the computational social science community of effective ways to use state-of-the-art NLP tools for human-focused NLP problems. His contributions to open-source have helped this community to conduct faster research on data at scale without being limited by inexperience in programming. During his bachelor’s, his research in time-series focused on non-stationary time-series in volatile systems. He also served as the first data scientist at Motorq (a connected car data platform), where he identified and solved some of the primal problems for the company. From Fall 2021, he will be a Ph.D. student at Stony Brook advised by Dr. Schwartz. On school days, you can find him in his natural habitat for most of the week: Room 242 (Data Science and Analysis lab), New CS Building.
I am enthused about NLP's expanding outreach in our lives and its unexplored abilities to understand human nature better and more efficiently than ever. Language is more than words: it expresses identities, psychologies, cultures, and much more. I find myself challenged in directing NLP language models to look beyond the current limitations and consider the human behind the language. The purpose of my research is to enable the growth of more empathy in an AI-future centric world, thereby augmenting humanity rather than detracting from it.
Sid Mangalik is a Ph.D. student in natural language processing at Stony Brook University working with Andy Schwartz. His previous studies focused on the intersection of artificial intelligence and psychology, where he worked with Ritwik Banerjee on language use in scientific writing and among abusers. His current research focuses on identifying connections between community-level language and psychological outcomes. He is the lead data scientist at Monitaur, a start-up focused on making machine learning models fairer, safer, and more transparent. In his free time, Sid writes music; he released his first album in 2020.
Swanie Juhng is a third-year PhD student doing research in natural language processing and psychology. Her research interest lies in utilizing linguistic cues and nonlinguistic variables to predict and analyze mental health problems, specifically anxiety and depression.
Aaron Marker
Aaron Marker is a second-year PhD student in computer science, working with Professor Schwartz on natural language processing (NLP) research, particularly focused on fairness and bias in NLP systems. He is currently evaluating predictive models to identify demographic biases. Aaron holds a BS with a double major in Computer Science and Cybersecurity from Randolph-Macon College, where he also minored in math and ethics. Beyond his interest in NLP and AI, Aaron enjoys art and spends his spare time painting and sketching.
Scott Feltman
Scott Feltman is a fourth-year PhD student in Applied Mathematics and Statistics, specializing in statistical methods in computational biology and working with Professor Schwartz. Currently, he is working to bridge the gap between state-of-the-art mathematical models and powerful transformers and other deep learning architectures for time-series data in the field of psychiatry. This union, in addition to the inclusion of natural language data, would allow for robust forecasting of stress levels in patients diagnosed with PTSD. When he's not working, he relaxes by playing the guitar, enjoying his passion for music.
Research Staff
Syeda Mahwish
Senior Research Coordinator
Syeda is the Senior Research Coordinator at HLAB. She received her Master’s in Public Health with a concentration in epidemiology and biostatistics at Stony Brook School of Medicine. Her multidisciplinary interests in psychology, public health, and machine learning have led her to explore these various topics. Within HLAB, Syeda works on several projects such as alcohol abuse disorder, cognitive dissonance, and her independent study involving Resilience and PTSD. Her research interests include resilience, stress, depression, coping, and alcohol abuse disorder. Syeda hopes to pursue a PhD in Clinical Psychology. smahwish@cs.stonybrook.edu
Postdocs
Oscar develops ways to measure, describe, and differentiate psychological constructs using Natural Language Processing and Machine Learning. He is particularly interested in measuring psychological well-being, including harmony in life and satisfaction with life. Recently, Oscar and HLAB members have developed an R package called Text for analyzing text using NLP and deep learning: https://r-text.org/
Visitors
Associate Research Professor and Principal Research Scientist
Dr. Ryan L. Boyd was an Associate Research Professor and the Principal Research Scientist for the HLAB during his time at Stony Brook. He is now an Assistant Professor in the Department of Psychology, School of Behavioral and Brain Sciences, at the University of Texas at Dallas. He is a psychologist and computational social scientist whose research spans topics ranging from personal to society-level social processes, including mental health, motives, emotion, human sexual behavior, and storytelling. He has authored dozens of free, open-source text analysis applications for social scientists and is one of the lead scientists behind Linguistic Inquiry and Word Count (LIWC), the most well-validated and widely-used social science text analysis application. Dr. Boyd has served on multiple editorial boards in the fields of Psychology and AI, and his work has been featured by numerous popular media, including The New York Times, the BBC, CNN, and the Washington Post. He is also a huge fan of good coffee, pizza, and big ol' friendly dogs.
Lucie Flek is a full-time professor at the University of Bonn, leading the research group on Conversational AI and Social Analytics (CAISA). Her research interests revolve around machine learning applications in Natural Language Processing (NLP), particularly in user modeling and stylistic variation. She investigates how language usage varies among individuals and sociodemographic groups, using this variation to predict in-group behavior in machine learning tasks. Her work also delves into bias in NLP, ethics issues, and performance of models on underrepresented groups. Lucie's Ph.D. focused on meaning ambiguity, incorporating expert lexical-semantic resources into DNN classification tasks. She actively collaborates across disciplines and has experience in industry projects related to limited training data scenarios. Lucie has a background in particle physics research and is passionate about cross-disciplinary collaborations.
Allie received her PhD under the supervision of Prof. Lucie Flek at the University of Bonn, Germany, in the Conversational AI and Social Analytics (CAISA) lab, studying natural language processing (NLP). Allie's research focuses on computational social science and conversation dynamics in goal-oriented settings and interpersonal interactions involving particular social intents, such as persuasion and offering support. She works with various domains and contexts, such as online forums for supportive and opinionated interactions and clinical conversations, in which she investigates stance dynamics, empathetic interactions, and transfer learning to better model social NLP tasks. She is also interested in developing theory-driven empathy research approaches for NLP that consider the complex affective and cognitive processes and social factors that influence empathetic expression and perception.
MS Students
Avish Parmar
Master's Student Fall 2023
Research Project: LFactor
Akshay Raghavan
Master's Thesis Fall 2024
Khushboo Singh
Master's Thesis 2024
Investigating the Efficacy of Large Language Models in Understanding Human Psychological States and Traits
Rajath Rao
Master's Student Fall 2024
Researching shallow-fusion techniques for speech processing, multimodal models for down-stream tasks, and a hint of brain computer interfaces.
Dhruv Kunjadiya
Benchmarking HaRT across downstream tasks and evaluating its effectiveness under different configurations
Pranav Chitale
Undergraduates
PhD, Postdoc, and Research Staff Alumni
Natural language processing with particular care for areas involving time-series analysis, social media, mental well-being, and computational social science. Additionally, I have a broader interest in applied machine learning as a whole for tasks related to vision and computer networking.
The intersection of computer science and psychology: I love applying data science and natural language processing methods to analyze human thoughts, their characteristics, and behaviors. Isn’t it cool to transform a person’s thought, something very abstract, into a concrete numeric vector space, manipulate and analyze these vectors, and finally map them back to the human thought space to predict the person’s next behavior or mental health state? My projects involve analyzing social media posts to understand the general public’s major beliefs on specific pre-defined topics.
Salvatore Giorgi is a data scientist working under Dr. Brenda Curtis at the National Institute on Drug Abuse (NIDA). He received his Ph.D. in Computer Science at the University of Pennsylvania working under Dr. H. Andrew Schwartz and Dr. Lyle Ungar. His research interests include machine learning applications to substance use and recovery, as well as relationships between individuals and their communities as expressed through language on social media.
Research Scientist at eBay
Computational social science and natural language processing; Social media language analyses for psychological health and well-being; Integrating language and extra-linguistic data; social networks and graph mining
Natural Language Processing (NLP) for social media analysis. I especially focus on discourse relation parsing to extract key information for targeted tasks such as opinions and reasons for a political stance or sentiment, and finding the correlations of discourse styles with human variables such as personality. I collaborate with psychologists and computational linguists for Human-centered language modeling to obtain higher accuracies of various NLP tasks from traditional tasks (e.g., sentiment analysis) to novel tasks such as discourse style analysis for psychological assessment and well-being measurement.
Now at Facebook Research, Seattle.
My interest is primarily in natural language processing, with some overlap into data mining and artificial intelligence. In addition to computer science, I am interested in the humanities and social sciences, particularly psychology, linguistics, and classics, and am drawn to projects that are interdisciplinary in nature. My long-term goal is to pursue a career doing NLP research, hopefully both in industry and academia.
Julia Buffolino
Research Coordinator
Julia is the Research Coordinator of the HLAB. She received her BA from Stony Brook University in 2019, majoring in Psychology and concentrating in Sociology. After graduation, she worked in the market research industry, with her focus being on medical devices, pharmaceuticals, and consumer packaged goods. While she enjoyed the work, she felt a drive to pursue her interest in mental health, and made the decision to return to academic research. Her areas of interest include mood disorders and health psychology. She hopes to pursue a graduate degree, to work with these interests in her future profession.
julia.buffolino@alumni.stonybrook.edu
Now a Postdoc at Stanford University. My research interests are at the intersection of natural language processing and computational social science. In particular, I am focused on making NLP models human-centric, socially aware, and cognizant of linguistic variation. I am also interested in applying NLP methods to uncover social biases through the lens of natural language.
Weixi Wang
weixiwang@cs.stonybrook.edu
Weixi is the former research coordinator of the HLAB. He received his BA in Sociology from the University of Texas at Austin. His research interests include stress, trauma exposure, PTSD, and related comorbidities (e.g. substance use disorder). The goal is to improve the treatment of trauma-related disorders to promote healing and empowerment among trauma survivors. He plans to pursue a Ph.D. in Clinical Psychology and a career as a researcher.
Shailen was an undergraduate senior researching human-centered NLP with Andy Schwartz. He is interested in explainability, understanding human cognition, and democratizing education with language models. He is excited by how understandings in human cognition can motivate improvements in language models. On campus, he mentors tutors at the SBU Academic Success and Tutoring Center, and he is Treasurer for the Stony Brook Environmental Club. During Summer 2023, he studied mastery learning algorithms for intelligent tutoring systems with Prof. Neil Heffernan at the Learning Sciences and Technologies Lab at WPI. He is now applying for CS PhD programs in NLP starting in Fall 2024.
Nicolas M. Legewie received his MA and PhD in social science from Humboldt University of Berlin. Currently, he is a postdoctoral visiting fellow at the Sociology Department, University of Pennsylvania. His research focuses on the role of social environments, such as personal and neighborhood networks, on educational and occupational attainment, and on upward mobility. He also writes and teaches about migration, the life course, research ethics, and research methodology such as video data analysis, mixed methods, and digital social science research. In a current project with H. Andrew Schwartz (Stony Brook) and Salvatore Giorgi (UPenn), he uses quantitative text analysis of large-scale geo-coded Twitter data, in combination with county-level census data, to study the impact of heterogeneity in cultural models of education and occupation in counties on individuals’ college enrollment and completion.
MS Alumni
Theodore Grossberndt
Master's Student
Speech to Text
Yejin Lee
Research Project: Understanding adolescent well being on Social Media
Himanshu Chaoudhary
Master's Student
Research Project: Sparkifying Transformer-based (LLM) Embeddings
Shreyashee Sinha
Research Project: Social Media Auto Regressive Transformer model
Gourab Dey
Master's Student
Research Project: Social Media Transformers: Socialite Llama2
Manal Shah
Research Project: Python application to embed large scale language data on Hadoop clusters
Farhan Ahmed
Farhan is a BS/MS alumnus who studied computer science and applied mathematics, working with Professor Andrew Schwartz. Farhan is interested in natural language processing and data science. His current research focuses on reducing selection bias from social media in making predictions for mental health and well-being.
Sumit Agarwal
Kanishta Agarwal
Research Project: Mood Forecasting: Using ARIMA to Predict Future Affect
Pooja Aravinder
Nipun Bayas
Research Project: Optimism in Social Media
Austin Borger
Mallikarjuna Budida
Research Project: BERT Feature Extraction on TPU
Swatilekha Chaudhury
Research Project: NCDS Longitudinal Essays
Pooja Dalaya
Research Project: Age and Income Weights for Sample Bias Correction
Pulkit Dongle
Research Project: DLATK: DeMySQLfying
Neelaabh Gupta
Research Project: Predicting Physical Activity using DLA
Omkar Kanade
Master's Student Fall 2023
Research Project: Improving IRT for Adaptive Language-based MH Assessment
Keshav Gupta
Research Project: Sample Bias Correction
Deepak Gupta
Research Project: LexHub Development
Emil Joswin
Research Project: Permutation Language Modeling and Data Collator for XLNet
Akash Idnani
Research Project: Latent Traits Exploration Extended Multi-Domain Single Factor
Kiranmayi Kasarapu
(MS, 2018; Now at Amazon)
Research Project: Social Mobility Prediction
Adarsh Kashyap
Research Project: Health Care Utilization
Parth Limbachiya
Research Project: Implementation of CoxPh Model
Rowan Menezes
Research Project: Brands Across Years
Sourav Mishra
Anvesh Myla
Research Project: Quantitative Evaluation of Interpretability of Latent Factors
Mihir Parulekar
Research Project: Privacy Analysis of BERT Embeddings
Sania Parveen
Research Project: BERT Feature Extraction
Adarsh Prabhakara
Aman Raj
(MS, 2017; Now at Google)
Research Project: Large-scale Social Media Assessment
Aravind Reddy
Research Project: Adolescent Depression - Longitudinal Representations
Damayanti Sengupta
Research Project: Social Media and Mental Health
Deven Shah
(MS, 2018; Now at Yahoo Research)
Research Project: Human Bias in Predictive Models
Swetambari Verma
(MS, 2018; Now at Facebook)
Research Project: Assessing Income and Education from Social Media Language
Jihu Mun
(Undergraduate; investigating the reliability of distress scores of Twitter users across different spatial and temporal resolutions.)
Anthony Xiang
(MS; music and audio representation, NLP)