
Uncover new worlds with AI ✨
Automatically detect exoplanets in NASA's astronomical datasets using our dual-pathway AI system. CTAB-GAN+ and Masked Autoencoders work together to improve the accuracy of planetary discovery.
Confirmed Exoplanets
NASA Exoplanet Archive (pscomppars)
Transit Detection Accuracy
MAE Model Performance (Example)
Light Curves Analyzed
Kepler, K2 & TESS Missions
Why ExoNova?
ExoNova combines state-of-the-art machine learning architectures with NASA's comprehensive astronomical datasets to revolutionize exoplanet discovery.
Analyzes both time-series light curves and tabular parameters using MAE-based anomaly detection for comprehensive exoplanet identification.
Leverages vast unlabeled datasets from Kepler, K2, and TESS missions to learn robust stellar behavior patterns without extensive manual labeling.
Provides confidence scores and feature importance attribution for every prediction, enabling scientific validation and interpretation.
Freely accessible web interface for researchers and enthusiasts to upload data, receive classifications, and contribute to planetary discovery.
Interactive Detection Demo
Adjust astronomical parameters from real exoplanet observations to see how our MAE model calculates the probability of planetary transit detection.
Transit midpoint: time at which the planet crosses the center of the star's disk [Barycentric Julian Date]
Proper motion: angular change in the star's position on the sky [mas/year]
Star brightness in the TESS bandpass [TESS magnitude]
Star radius relative to Sun [R☉]
Distance from Earth to star system [parsecs]
Planet radius relative to Earth [R⊕]
Transit Detection Probability
Unlikely Planet Probability
Model Architecture
ExoNova employs two complementary AI models: CTAB-GAN+ for synthetic data generation and missing-value imputation, and a Masked Autoencoder (MAE) for self-supervised anomaly detection in light curves and tabular data.
Transit Detection Rate
Dual-Model Approach
CTAB-GAN+ generates synthetic training data and handles missing values, while MAE performs self-supervised anomaly detection on both time-series light curves and tabular parameters.
Detection Pipeline Metrics (Example)
The Transit Method
The transit method is the most successful technique for detecting exoplanets. When a planet passes in front of its host star, it blocks a small fraction of the star's light, creating a temporary dip in brightness that we can measure.
Visualizing a Transit Event
Before Transit
The star emits constant light at full brightness. No planetary obstruction occurs, establishing the baseline luminosity.
During Transit
As the planet crosses in front of the star, it blocks a small fraction of light, creating a measurable brightness dip that reveals the planet's presence.
After Transit
The planet completes its crossing and the star returns to full brightness, creating a characteristic light curve pattern used for detection.
Key Detection Parameters
Brightness reduction during transit
Time planet crosses star
Time between transits
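The three phases and three parameters above can be sketched as an idealized box-shaped light curve. This is a minimal illustration, not ExoNova's model: it ignores limb darkening, ingress and egress slopes, and noise, and the function name and example values are hypothetical.

```python
def box_transit_flux(times, t0, period, duration, depth):
    """Relative flux for an idealized box-shaped transit.

    times, t0, period and duration share one time unit (for example days);
    depth is the fractional brightness drop (0.01 means a 1% dip).
    """
    half = duration / 2.0
    flux = []
    for t in times:
        # Offset from the nearest transit midpoint, folded into
        # the interval [-period/2, period/2).
        offset = (t - t0 + period / 2.0) % period - period / 2.0
        in_transit = abs(offset) <= half
        # Baseline is 1.0 before and after transit; 1 - depth during it.
        flux.append(1.0 - depth if in_transit else 1.0)
    return flux


# Hypothetical planet: 3-day period, 0.2-day transits, 1% depth.
times = [i * 0.05 for i in range(200)]  # 10 days sampled every 0.05 days
flux = box_transit_flux(times, t0=1.0, period=3.0, duration=0.2, depth=0.01)
n_in_transit = sum(1 for f in flux if f < 1.0)
print(f"{n_in_transit} of {len(flux)} samples fall inside a transit")
```

Folding every timestamp onto the nearest midpoint is the same idea used when repeated transits are stacked to measure the period.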
Why It Works
The transit method's success relies on precise photometric measurements. Even a small planet like Earth blocks only about 0.008% of the Sun's light (roughly 84 parts per million), a tiny but detectable signal.
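By observing repeated transits, we can measure the planet's orbital period directly. The transit depth gives the planet's size relative to its star, and the period combined with an estimate of the star's mass gives the planet's distance from its star. Combined with additional observations, this enables characterization of potentially habitable worlds.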
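The arithmetic behind these two paragraphs is short enough to write out. The sketch below assumes a central transit of a uniformly bright stellar disk, so the depth is the ratio of the two disk areas. It also uses Kepler's third law in solar units with the planet's mass neglected. The function names are illustrative, and the stellar radius and mass must come from other observations.

```python
import math

R_EARTH_KM = 6371.0    # mean Earth radius
R_SUN_KM = 695_700.0   # nominal solar radius


def transit_depth(planet_radius, star_radius):
    """Fractional dip in brightness: ratio of the two disk areas."""
    return (planet_radius / star_radius) ** 2


def planet_radius_earth(depth, star_radius_rsun):
    """Invert the depth relation to get the planet radius in Earth radii."""
    return math.sqrt(depth) * star_radius_rsun * R_SUN_KM / R_EARTH_KM


def semi_major_axis_au(period_days, stellar_mass_msun):
    """Kepler's third law in solar units: a^3 = M * P^2 (AU, solar masses, years)."""
    period_years = period_days / 365.25
    return (stellar_mass_msun * period_years ** 2) ** (1.0 / 3.0)


earth_depth = transit_depth(R_EARTH_KM, R_SUN_KM)
print(f"Earth-Sun transit depth: {earth_depth * 1e6:.0f} ppm")
print(f"Orbital distance for a 365.25-day period: {semi_major_axis_au(365.25, 1.0):.2f} AU")
```

For an Earth-Sun analogue the depth comes out near 84 parts per million, which is why detecting such planets needs space-based photometric precision.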
ExoNova's Advantage: Our AI models are designed to detect subtle transit patterns that traditional analysis might miss, with the aim of increasing discovery rates and reducing false positives.
Technical Architecture
ExoNova's dual-pathway detection system combines two state-of-the-art AI models: CTAB-GAN+ for data quality enhancement and a Masked Autoencoder for anomaly detection. This approach enables accurate exoplanet identification while handling the challenges of incomplete and imbalanced astronomical datasets.
CTAB-GAN+
Synthetic Data Generation & Imputation
Key Capabilities:
- Generates physically plausible synthetic exoplanet samples
- Handles missing values in observational data
- Preserves correlations between astronomical parameters
- Addresses class imbalance in training datasets
Architecture:
- Conditional Generative Adversarial Network
- Mixed data type handling (continuous & categorical)
- Mode-specific normalization for tabular data
- Classifier and regressor guidance for generation
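Mode-specific normalization, listed above, encodes a continuous value as a scalar offset within one mode of a Gaussian mixture plus a one-hot mode indicator. The sketch below assumes the mixture has already been fitted and picks the most likely mode deterministically. Published implementations typically fit a variational Gaussian mixture and sample the mode instead. The mode parameters here are made up for illustration.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mode_specific_normalize(x, modes):
    """Encode x against a fitted mixture.

    modes is a list of (weight, mean, std) tuples. Returns (alpha, beta):
    alpha is the offset within the chosen mode, scaled by 4 standard
    deviations and clipped to [-1, 1]; beta is the one-hot mode indicator.
    """
    # Unnormalized posterior of each mode given x.
    resp = [w * gaussian_pdf(x, m, s) for w, m, s in modes]
    k = max(range(len(modes)), key=lambda i: resp[i])
    _, mean, std = modes[k]
    alpha = (x - mean) / (4.0 * std)
    alpha = max(-1.0, min(1.0, alpha))  # clipping is a simplification of this sketch
    beta = [1 if i == k else 0 for i in range(len(modes))]
    return alpha, beta


# Hypothetical two-mode column: (weight, mean, std) per mode.
modes = [(0.5, 0.0, 1.0), (0.5, 10.0, 2.0)]
print(mode_specific_normalize(2.0, modes))
print(mode_specific_normalize(10.0, modes))
```

Encoding the mode separately lets a generator represent multi-peaked columns that a single min-max or z-score scaling would blur together.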
Training Approach:
- Adversarial training with generator-discriminator dynamics
- Conditional generation based on exoplanet class labels
- Regularization to enforce physical constraints
- Progressive training for stability
Performance Metrics:
- Classification accuracy improvement: Placeholder (TODO)
- Data augmentation ratio: Placeholder (TODO)
- Missing data imputation accuracy: Placeholder (TODO)
- Correlation preservation score: Placeholder (TODO)
Masked Autoencoder (MAE)
Self-Supervised Anomaly Detection
Key Capabilities:
- Detects transit signals in light curves
- Learns from unlabeled stellar observations
- Processes both time-series and tabular data
- Provides reconstruction-based anomaly scores
Architecture:
- Transformer-based encoder-decoder architecture
- High masking ratio (75%) for efficient learning
- Asymmetric design (lightweight decoder)
- Positional embeddings for sequence data
Training Approach:
- Self-supervised pre-training on unlabeled data
- Random masking of input segments
- Reconstruction loss minimization
- Fine-tuning on labeled exoplanet samples
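The masking and reconstruction steps above can be sketched without a neural network. In this simplified version each light-curve segment is a single number, 75% of segments are hidden at random, and the loss is the mean squared error over the hidden segments only. The function names are illustrative, and a real MAE would operate on vectors produced by a Transformer encoder and decoder.

```python
import random


def random_mask(n_segments, mask_ratio=0.75, seed=None):
    """Return a list of booleans; True marks a segment hidden from the encoder."""
    rng = random.Random(seed)
    n_masked = int(round(n_segments * mask_ratio))
    masked = set(rng.sample(range(n_segments), n_masked))
    return [i in masked for i in range(n_segments)]


def masked_mse(original, reconstructed, mask):
    """Mean squared error computed only on the masked segments."""
    errors = [(o - r) ** 2 for o, r, m in zip(original, reconstructed, mask) if m]
    return sum(errors) / len(errors) if errors else 0.0


# Toy light curve of 16 segments with one dip; values are made up.
segments = [1.0] * 16
segments[5] = 0.99
mask = random_mask(len(segments), mask_ratio=0.75, seed=0)

# Stand-in "model" that always predicts the baseline flux.
prediction = [1.0] * len(segments)
print(f"masked segments: {sum(mask)}, loss: {masked_mse(segments, prediction, mask):.2e}")
```

Because the encoder only sees the visible quarter of the input, pre-training stays cheap, and segments the model cannot reconstruct well become candidates for anomalies such as transit dips.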
Performance Metrics:
- Light curve anomaly detection: Placeholder (TODO)
- False positive reduction: Placeholder (TODO)
- Early discard efficiency: Placeholder (TODO)
- Combined classification accuracy: Placeholder (TODO)
Detection Pipeline Flow
Our dual-pathway system processes astronomical data through both time-series analysis and tabular parameter evaluation, combining the strengths of CTAB-GAN+ for data quality enhancement and MAE for pattern recognition.
Data Sources & NASA Missions
ExoNova is trained on publicly available data from NASA's most successful exoplanet hunting missions, representing decades of astronomical observations and thousands of confirmed planetary discoveries.
Kepler Mission
Launched: 2009
2,662 Exoplanets
Confirmed Discoveries
NASA's first dedicated exoplanet hunting mission. Continuously monitored about 150,000 stars in a single field in the constellations Cygnus and Lyra, discovering thousands of planets and revolutionizing our understanding of planetary systems.
K2 Mission
Began: 2014
500+ Exoplanets
Confirmed Discoveries
Extended Kepler mission using a modified observing strategy. After the failure of two of the spacecraft's reaction wheels, K2 observed a series of star fields along the ecliptic, discovering hundreds of additional planets and demonstrating mission adaptability.
TESS Mission
Launched: 2018
400+ Exoplanets
Confirmed Discoveries
Transiting Exoplanet Survey Satellite conducting an all-sky survey. TESS monitors 200,000+ nearby bright stars, discovering planets around our stellar neighbors and enabling detailed follow-up observations.
Training Datasets
Planetary Systems Composite Parameters
Comprehensive database of 6,022 confirmed exoplanets with detailed orbital and physical parameters, compiled by the NASA Exoplanet Archive from space missions and ground-based surveys.
View Dataset API →
TESS Objects of Interest (TOI)
Catalog of planetary candidates and false positives from TESS, providing crucial negative examples for balanced model training.
View TOI Documentation →
Data Coverage
Time-series observations analyzed
Astronomical features tracked per object
From Kepler (2009) to TESS (present)
Mission Timeline
Open Data Philosophy
All training data comes from NASA's publicly accessible Exoplanet Archive, ensuring transparency and reproducibility. This open science approach allows researchers worldwide to validate our methods and contribute to planetary discovery.
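To reproduce the training inputs, the composite parameters table can be queried through the Exoplanet Archive's TAP service. The sketch below only builds the request URL and defines an optional download helper. The column selection is an illustrative assumption, not the feature set ExoNova uses.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

TAP_SYNC_URL = "https://exoplanetarchive.ipac.caltech.edu/TAP/sync"


def build_pscomppars_query(columns):
    """Build a TAP URL selecting the given columns from the pscomppars table."""
    adql = f"select {', '.join(columns)} from pscomppars"
    return TAP_SYNC_URL + "?" + urlencode({"query": adql, "format": "csv"})


def fetch_csv(url, timeout=30):
    """Download the query result as CSV text (performs a network request)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Planet name, orbital period, planet radius, stellar radius, system distance.
url = build_pscomppars_query(["pl_name", "pl_orbper", "pl_rade", "st_rad", "sy_dist"])
print(url)
```

Calling `fetch_csv(url)` returns the table as CSV text, which can then be parsed with the standard `csv` module.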
Applications & Use Cases
ExoNova serves diverse audiences from professional astronomers to curious students, providing accessible AI-powered exoplanet detection capabilities for everyone.
For Researchers
Advanced astronomical data analysis
Key Capabilities:
- Automated dataset analysis with batch processing
- Confidence scoring for each detection
- Feature importance attribution for validation
- Export results in standard astronomical formats
Typical Workflow:
Upload → Analyze → Validate → Publish
For Educators
Engaging classroom demonstrations
Key Capabilities:
- Interactive demonstrations of transit method
- Visual explanations of detection algorithms
- Real-time parameter adjustment for learning
- Curriculum integration materials
Typical Workflow:
Demonstrate → Adjust → Explain → Engage
For Students
Hands-on learning experience
Key Capabilities:
- Accessible interface requiring no coding
- Instant feedback on parameter changes
- Learn-by-experimentation approach
- Understand real astronomical data
Typical Workflow:
Explore → Experiment → Learn → Discover
For Citizen Scientists
Contribute to planetary discovery
Key Capabilities:
- Upload your own astronomical observations
- Community collaboration and data sharing
- Contribute to open science initiatives
- Access to the same tools as professionals
Typical Workflow:
Observe → Upload → Collaborate → Contribute
Real-World Impact
Universal Access to Exoplanet Science
Whether you're conducting cutting-edge research, teaching the next generation of scientists, or simply curious about planets beyond our solar system, ExoNova provides the tools and accessibility to make exoplanet detection available to everyone.
Our open-source platform eliminates barriers to entry while maintaining the rigor and accuracy required for professional astronomical research. Join our community and start your exoplanet discovery journey today.
How ExoNova Detects Exoplanets
Our end-to-end detection pipeline combines dual-modal anomaly detection with exploratory data analysis, processing your data through seven specialized stages to deliver accurate, scientifically validated exoplanet classifications with full explainability.
Data Ingestion
Upload light curve time-series and tabular astronomical parameters from your observations or NASA archives.
- Supports multiple data formats (CSV, FITS, JSON)
- Automatic format detection and validation
- Handles both complete and partial datasets
- Batch processing for multiple observations
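Format detection and tolerant parsing of the kind listed above might look like the following sketch. It relies on the fact that a FITS file begins with the `SIMPLE  =` header keyword. The CSV column names `time` and `flux` are assumptions for illustration, and rows with missing values are skipped rather than rejected.

```python
import csv
import io


def detect_format(head: bytes) -> str:
    """Guess the file format from the first bytes of an upload."""
    if head.startswith(b"SIMPLE  ="):
        return "fits"
    text = head.decode("utf-8", errors="replace").lstrip()
    if text[:1] in ("{", "["):
        return "json"
    first_line = text.splitlines()[0] if text else ""
    if "," in first_line:
        return "csv"
    return "unknown"


def parse_csv_light_curve(text):
    """Return (time, flux) pairs, skipping rows with missing or bad values."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        try:
            rows.append((float(record["time"]), float(record["flux"])))
        except (KeyError, TypeError, ValueError):
            continue  # partial row: keep the rest of the dataset
    return rows


sample = "time,flux\n0.0,1.0\n0.5,\n1.0,0.99\n"
print(detect_format(sample.encode("utf-8")))
print(parse_csv_light_curve(sample))
```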
Light Curve & Initial Tabular Analysis
The MAE model processes both time-series photometric data and tabular parameters to detect anomalous patterns indicating potential transits.
- Self-supervised reconstruction learning on light curves
- Initial tabular data feature extraction
- Identifies periodic brightness dips and parameter anomalies
- Generates preliminary anomaly score (0-100)
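One way to turn a raw reconstruction error into a 0-100 score is to rank it against errors measured on reference light curves. This is an assumed scoring rule for illustration. The page does not say how ExoNova scales its score, and the reference errors below are made up.

```python
from bisect import bisect_right


def anomaly_score(error, reference_errors):
    """Percentile rank (0-100) of an error among reference errors."""
    ordered = sorted(reference_errors)
    if not ordered:
        raise ValueError("reference_errors must not be empty")
    # Count of reference errors less than or equal to this one.
    rank = bisect_right(ordered, error)
    return 100.0 * rank / len(ordered)


# Hypothetical reconstruction errors from quiet, non-transiting stars.
reference = [0.1, 0.2, 0.3, 0.4]
for err in (0.0, 0.25, 0.5):
    print(err, anomaly_score(err, reference))
```

A rank-based score is insensitive to the absolute scale of the loss, so it stays comparable if the model is retrained.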
Early Filtering
A configurable threshold discards obvious non-candidates, reducing computational load and focusing on promising detections.
- User-adjustable sensitivity threshold
- Trade-off between completeness and precision
- Filters out ~70-80% of non-planetary objects
- Configurable for discovery vs. validation mode
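The filtering step reduces to a threshold on the preliminary score. In the sketch below the two mode thresholds are hypothetical values chosen only to show the trade-off: a low threshold keeps more candidates (completeness), and a high one keeps fewer but cleaner ones (precision).

```python
# Hypothetical thresholds on the 0-100 anomaly score.
MODE_THRESHOLDS = {"discovery": 30.0, "validation": 70.0}


def early_filter(candidates, mode="discovery"):
    """Keep candidates at or above the mode's threshold.

    candidates is a list of dicts with a "score" key.
    Returns (kept, discarded_fraction).
    """
    threshold = MODE_THRESHOLDS[mode]
    kept = [c for c in candidates if c["score"] >= threshold]
    discarded = 1.0 - len(kept) / len(candidates) if candidates else 0.0
    return kept, discarded


candidates = [
    {"id": "A", "score": 12.0},
    {"id": "B", "score": 45.0},
    {"id": "C", "score": 81.0},
    {"id": "D", "score": 5.0},
]
for mode in MODE_THRESHOLDS:
    kept, discarded = early_filter(candidates, mode)
    print(mode, [c["id"] for c in kept], f"{discarded:.0%} discarded")
```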
Parameter Extraction
Extracts astronomical features from surviving candidates and uses CTAB-GAN+ to handle missing values.
- Extracts 50+ physical and orbital parameters
- CTAB-GAN+ imputes missing values
- Preserves physical constraints and correlations
- Normalizes data for consistent analysis
Exploratory Data Analysis
A Random Forest model performs feature importance analysis and exploratory classification on the tabular astronomical parameters.
- Feature importance ranking for each parameter
- Statistical correlation analysis
- Exploratory classification for data insights
- Identifies key discriminative features
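The statistical correlation bullet can be illustrated with a plain Pearson ranking of features against the class label. This sketch does not reproduce the Random Forest importance step, and the feature names and values are made up.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def rank_features(table, labels):
    """Sort feature names by absolute correlation with the labels."""
    scores = {name: abs(pearson(values, labels)) for name, values in table.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical features for four objects; label 1 marks a planet.
table = {
    "transit_depth": [1.0, 2.0, 3.0, 4.0],
    "noise_level": [1.0, -1.0, 1.0, -1.0],
}
labels = [0, 0, 1, 1]
for name, score in rank_features(table, labels):
    print(f"{name}: {score:.3f}")
```

Correlation only captures linear, one-feature-at-a-time relationships, which is one reason a tree ensemble is useful alongside it.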
Anomaly Detection
The Masked Autoencoder performs final anomaly detection on the combined features to predict exoplanet probability with confidence scoring.
- Multi-modal anomaly detection (light curves + tabular)
- Self-supervised deep learning analysis
- Exoplanet probability score (0-100%)
- Uncertainty quantification and confidence intervals
Classification & Results
Final determination combining all analyses with detailed feature importance, confidence metrics, and exportable scientific report.
- Classification: Confirmed / Candidate / False Positive
- Combined confidence score from all models
- Comprehensive feature importance attribution
- Exportable report (PDF, JSON, CSV)
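A result record with the three labels above could be assembled as in this sketch. The cut-off values, the simple averaging of model scores, and the field names are all assumptions made for illustration, since the page does not specify how the final determination is computed.

```python
import json


def classify(probability, confirm_at=0.9, candidate_at=0.5):
    """Map a combined probability to a label using hypothetical cut-offs."""
    if probability >= confirm_at:
        return "Confirmed"
    if probability >= candidate_at:
        return "Candidate"
    return "False Positive"


def build_report(target, model_scores):
    """Combine per-model probabilities by simple averaging into one record."""
    combined = sum(model_scores.values()) / len(model_scores)
    return {
        "target": target,
        "classification": classify(combined),
        "combined_confidence": round(combined, 4),
        "model_scores": model_scores,
    }


# Hypothetical target and scores.
report = build_report("Example-1 b", {"mae": 0.8, "random_forest": 0.6})
print(json.dumps(report, indent=2))
```

The dictionary serializes directly to JSON. CSV or PDF exports would need their own writers.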
Pipeline Performance Summary
Each stage is optimized for speed and accuracy, with automatic checkpoints and error handling to ensure reliable results. The entire pipeline can process hundreds of observations in parallel for large-scale surveys.