👋 Hi, I’m Kevin Mota da Costa
Data Scientist | Machine Learning Researcher | Python Developer | Scientific Computing
📌 About Me
I’m a data-driven researcher with a strong foundation in artificial intelligence, machine learning, and statistical modeling. My work applies modern data analysis techniques, such as Gaussian Processes, Neural Networks, and Bayesian inference, to extract insights from complex scientific datasets.
I enjoy building solutions that combine statistical rigor, efficient code, and clear visualizations to solve real-world problems, especially in physics and the observational sciences.
🎓 Academic Experience
Federal University of Espírito Santo (UFES), Brazil
- B.Sc. in Physics (2020–2024)
  - Focused on data analysis, machine learning, and computational modeling in scientific contexts.
- Undergraduate Thesis: “Statistical and Artificial Intelligence Approaches for Cosmological Data Reconstruction”
  - Applied Neural Networks and Gaussian Processes to reconstruct cosmological datasets.
  - Demonstrated the effectiveness of AI techniques in handling and recovering observational data.
  - Used Markov Chain Monte Carlo (MCMC) methods to estimate cosmological parameters (e.g., matter density, curvature, Hubble constant).
  - Tools: Python, Scikit-learn, NumPy, Matplotlib, SciPy
- Research in AI and Cosmology (Aug 2022 – Oct 2024)
  - Investigated the use of supervised learning models (NNs, GPs) in predicting and modeling scientific phenomena.
  - Emphasis on Bayesian machine learning for probabilistic reasoning and uncertainty estimation.
- Cosmological Distance Estimation (2021–2022)
  - Developed data analysis pipelines to estimate galactic distances from variable stars.
  - Used public datasets (OGLE-IV) and implemented time-series analysis with the Lomb-Scargle periodogram.
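As a rough illustration of the period-finding step mentioned above, here is a minimal sketch of the classical Lomb-Scargle periodogram written with NumPy only. It is not the project's actual pipeline: the function name `lomb_scargle`, the synthetic light curve, and the 5-day period are all made up for the example. For real work, Astropy's `LombScargle` class or `scipy.signal.lombscargle` would be the more usual choices.

```python
import numpy as np

def lomb_scargle(t, y, freqs):
    """Classical Lomb-Scargle power at each trial frequency (cycles per unit time)."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    y = y - y.mean()  # the classical form assumes a zero-mean signal
    power = np.empty(len(freqs))
    for i, f in enumerate(freqs):
        w = 2.0 * np.pi * f
        # Time offset tau that makes the sine and cosine terms orthogonal
        tau = np.arctan2(np.sin(2 * w * t).sum(), np.cos(2 * w * t).sum()) / (2 * w)
        c = np.cos(w * (t - tau))
        s = np.sin(w * (t - tau))
        power[i] = 0.5 * ((y @ c) ** 2 / (c @ c) + (y @ s) ** 2 / (s @ s))
    return power

# Synthetic, irregularly sampled light curve (invented values, not OGLE data)
rng = np.random.default_rng(0)
true_period = 5.0  # days, hypothetical
t = np.sort(rng.uniform(0.0, 200.0, 300))
mag = 15.0 + 0.3 * np.sin(2 * np.pi * t / true_period) + rng.normal(0.0, 0.05, t.size)

freqs = np.linspace(0.01, 1.0, 5000)  # trial frequencies in cycles per day
power = lomb_scargle(t, mag, freqs)
best_period = 1.0 / freqs[np.argmax(power)]
print(f"Recovered period: {best_period:.2f} days")
```

The periodogram handles uneven sampling directly, which is why it suits ground-based survey light curves where observations are not equally spaced.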
📄 Research Publication
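“Determinação da Distância à Grande Nuvem de Magalhães Através das Estrelas Variáveis Cefeidas” (English: “Determination of the Distance to the Large Magellanic Cloud Through Cepheid Variable Stars”)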
Kevin Mota da Costa, Alan Miguel Velásquez, Julio Cesar Fabris
📄 NASA ADS Abstract
- Used time-series data and statistical modeling to derive distance estimations.
- Built a custom Period-Luminosity relation using real data from 4700+ variable stars.
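The idea behind a Period-Luminosity distance estimate can be sketched as follows. This is an illustration under stated assumptions, not the method or numbers from the paper: the slope, the absolute zero point, and the distance modulus used to generate the fake sample are all hypothetical, and the sketch assumes the same slope applies to apparent and absolute magnitudes. Only the distance-modulus relation, d = 10^(μ/5 + 1) pc, is standard.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical calibration, for illustration only (not taken from the paper):
# absolute magnitude M = slope * log10(P) + zero_point_abs
slope_true = -2.8
zero_point_abs = -1.4
mu_true = 18.5  # assumed distance modulus used to generate the fake sample

# Fake sample: log-periods and noisy apparent magnitudes
log_p = rng.uniform(0.3, 1.5, 500)
m_app = slope_true * log_p + zero_point_abs + mu_true + rng.normal(0.0, 0.1, log_p.size)

# Fit the apparent-magnitude relation m = slope * log10(P) + intercept
slope_fit, intercept_fit = np.polyfit(log_p, m_app, 1)

# Distance modulus mu = m - M; with a shared slope it is the zero-point offset
mu_fit = intercept_fit - zero_point_abs
distance_pc = 10.0 ** (mu_fit / 5.0 + 1.0)

print(f"slope = {slope_fit:.2f}, mu = {mu_fit:.2f}, d = {distance_pc / 1e3:.1f} kpc")
```

With real data, extinction corrections and the choice of photometric band would also affect the fitted zero point; they are left out here to keep the sketch short.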
🛠️ Skills
- Languages: Python, C++
- Machine Learning: Neural Networks, Gaussian Processes, MCMC, Regression, Classification
- Libraries: Scikit-learn, SciPy, NumPy, Matplotlib, Astropy, OpenCV
- Data Science: Data Cleaning, Feature Engineering, Statistical Modeling, Visualization
- Concepts: Bayesian Inference, Time-Series Analysis, Supervised Learning, Uncertainty Quantification
- Databases: SQL, Data Modeling, Relational Design
📁 Repository Structure
This repository is organized into several core areas of study and application:
- Neural Networks (Perceptron, MLP, CNN, RNN)
- Gaussian Processes and Bayesian inference
- Applications to scientific modeling and regression tasks
- SQL queries and relational design
- Data modeling and schema normalization
- Examples of real-world datasets and database projects
- Python and C++ fundamentals
- Classical algorithms (searching, sorting, pseudorandom generation)
- Numerical methods: interpolation, integration, linear systems
- Mathematical tools for computation
- Physics models and quantitative reasoning
- Statistical techniques, including Monte Carlo methods and inference theory
- End-to-end applications using the above techniques
- Integration of AI, mathematics, and statistical modeling in real scenarios
🌐 Links
- 🔗 NASA ADS Abstract
- 💻 GitHub Page: [CostaKevin.github.io](https://costakevin.github.io/)
“Turning data into insight, and insight into understanding.”