
Programming, Data, and IT

I am a lifelong learner with a passion for self-driven exploration and experimentation. One area I continually invest in is computer technology, where rapid advancements inspire me to stay current and adaptable. I take a proactive approach to learning, not only to keep pace with innovation but also to ensure that my knowledge translates into practical, tangible skills that enhance my work.


My journey into programming, data, and information technology began during my graduate studies at Carnegie Mellon University, where I took my first programming course in R. This experience sparked a keen interest in data, and I've since leveraged R extensively for statistical modeling and visualization.


Driven by this initial foray into coding, I pursued further learning through numerous online programming courses, broadening my skill set and deepening my understanding of software development principles. I also developed a foundational knowledge of machine learning, gaining practical experience in building and testing models.

My technical proficiency spans a range of tools and technologies: Python, which I use for data manipulation and scripting; SQL, for managing and querying databases efficiently; VS Code as my primary development environment; Docker for containerization; AWS for cloud infrastructure management; and GitHub for version control and collaborative development workflows. Google Colab serves as a valuable tool for cloud-based data analysis and machine learning experimentation. This combination of academic foundation, self-directed learning, and hands-on experience has equipped me with a comprehensive skill set in programming and information technology, allowing me to approach complex problems with confidence and adaptability.


Two key projects highlighting my data analysis and programming capabilities are presented below.

Data Visualization Project

In this project, I explored the principles of data visualization, treating it as a tool to amplify cognition through visual representations of abstract data. I learned a structured visualization process: formulating questions, researching and analyzing data, identifying key findings, tailoring the message to a specific audience, creating clear and accurate visualizations, and incorporating user feedback for refinement. I applied these principles in Tableau to visualize trends in foreign language learning in secondary and post-secondary education. This involved downloading and cleaning educational datasets, then creating visualizations that surface the patterns and insights in the data, demonstrating how data visualization can communicate complex information.
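The visualizations themselves were built in Tableau, but the cleaning and reshaping step that precedes them can be sketched in Python with pandas. This is a minimal illustration, not the project's actual code: the file name and column names below are hypothetical stand-ins for the education datasets I worked with.

```python
import pandas as pd

# Hypothetical file and columns standing in for the education datasets.
df = pd.read_csv("language_enrollment.csv")

# Typical cleaning before loading into Tableau: normalize column names,
# coerce types, and drop incomplete records.
df.columns = df.columns.str.strip().str.lower().str.replace(" ", "_")
df["enrollment"] = pd.to_numeric(df["enrollment"], errors="coerce")
df = df.dropna(subset=["year", "language", "enrollment"])
df["year"] = df["year"].astype(int)

# Aggregate to the level visualized: total enrollment per language per year.
trends = df.groupby(["year", "language"], as_index=False)["enrollment"].sum()
trends.to_csv("language_enrollment_clean.csv", index=False)
```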


Reflecting on this project, I found that transforming raw educational data into meaningful visualizations highlighted the critical role of each step in the visualization workflow. From defining the research question to tailoring the presentation for the intended audience, each phase contributed to the clarity and impact of the final product. Using Tableau let me experience firsthand how powerful visual tools can be in uncovering and communicating trends that might be obscured in raw data. The experience also underscored the importance of data cleaning and preparation, since the quality of a visualization depends directly on the integrity of the underlying data. The feedback-driven iteration was equally valuable, reinforcing that effective data visualization requires continuous refinement to stay clear and relevant. This project not only enhanced my technical skills in data visualization but also deepened my understanding of how to communicate complex information visually, emphasizing thoughtful design and audience consideration in instructional technology.

ML Project on Language Prediction

This project is the culmination of my exploration of machine learning: an AI-powered language prediction model. Using TensorFlow and Keras in the Google Colab environment, I built a system that identifies the language of a user-entered sentence. The project began with the crucial step of data acquisition, gathering diverse text samples in seven languages (English, French, Portuguese, Spanish, Italian, German, and Swedish) to serve as the training set. The core of the model is a neural network that processes text sequences and outputs a probability for each language. My code implements a text tokenizer, which converts an input sentence into a numerical sequence the model can understand, and then pads the sequences so they are all the same length. At inference time, the user is prompted to enter a sentence, which is tokenized, padded, and passed through the trained model; the output is the predicted language along with its probability score, a clear, quantifiable measure of the model's confidence.
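A minimal sketch of this pipeline is shown below. It assumes a small in-memory dataset rather than the full corpus gathered for the project; the example sentences, architecture, and hyperparameters are illustrative stand-ins, not the exact configuration I used.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Illustrative (sentence, language) pairs standing in for the full dataset.
texts = [
    "the cat sleeps on the sofa",      # English
    "le chat dort sur le canapé",      # French
    "o gato dorme no sofá",            # Portuguese
    "el gato duerme en el sofá",       # Spanish
    "il gatto dorme sul divano",       # Italian
    "die Katze schläft auf dem Sofa",  # German
    "katten sover på soffan",          # Swedish
]
langs = ["English", "French", "Portuguese", "Spanish",
         "Italian", "German", "Swedish"]
label_names = sorted(set(langs))
labels = np.array([label_names.index(l) for l in langs])

# Tokenize: map each word to an integer index.
tokenizer = Tokenizer(oov_token="<OOV>")
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)

# Pad so every sequence has the same length.
max_len = max(len(s) for s in sequences)
padded = pad_sequences(sequences, maxlen=max_len, padding="post")

# A small embedding + pooling classifier (the real model may differ).
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(len(tokenizer.word_index) + 1, 16),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(len(label_names), activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(padded, labels, epochs=50, verbose=0)

# Inference: tokenize and pad the user's sentence, then report the
# most probable language and the model's confidence.
sentence = input("Enter a sentence: ")
seq = pad_sequences(tokenizer.texts_to_sequences([sentence]),
                    maxlen=max_len, padding="post")
probs = model.predict(seq, verbose=0)[0]
best = int(np.argmax(probs))
print(f"Predicted language: {label_names[best]} ({probs[best]:.1%})")
```

Padding lets the network consume variable-length sentences as a single tensor, and the softmax output layer produces the per-language probabilities that are reported back to the user.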


This project served as a practical application of the machine learning concepts I've diligently studied and implemented in Colab. Building this language prediction model required a deep understanding of various machine learning principles, including data preprocessing, model architecture design, and performance evaluation. I leveraged TensorFlow and Keras to construct and train a neural network, gaining hands-on experience in building and deploying machine learning models.
