Astronomy & Astrophysics Research Environment - Getting Started
Time to Complete: 20 minutes
Cost: $15-25 for tutorial
Skill Level: Beginner (no cloud experience needed)
What You’ll Build
By the end of this guide, you’ll have a working astronomy research environment that can:
- Process large telescope survey data (FITS images)
- Run astronomical data analysis with Python and specialized tools
- Handle datasets up to 2TB in size
- Perform image processing and photometry analysis
Meet Dr. Sarah Johnson
Dr. Sarah Johnson is an astronomer at Caltech. She analyzes galaxy survey data from the Hubble Space Telescope but waits 8-10 days for university supercomputer access. Each analysis takes days to queue, delaying critical discovery publications.
Before: 10-day waits + 12-hour analysis = 10.5 days per discovery
After: 15-minute setup + 6-hour analysis = same-day results
Time Saved: 94% faster research cycle
Cost Savings: $900/month vs $3,200 supercomputer allocation
Before You Start
What You Need
- AWS account (free to create)
- Credit card for AWS billing (charged only for what you use)
- Computer with internet connection
- 20 minutes of uninterrupted time
Cost Expectations
- Tutorial cost: $15-25 (we’ll clean up resources when done)
- Daily research cost: $35-100 per day when actively analyzing
- Monthly estimate: $400-1200 per month for typical usage
- Free tier: Some storage included free for first 12 months
Skills Needed
- Basic computer use (creating folders, installing software)
- Copy and paste commands
- No cloud or astronomy experience required
Step 1: Install AWS Research Wizard
Choose your operating system:
macOS/Linux
curl -fsSL https://install.aws-research-wizard.com | sh
Windows
Download from: https://github.com/aws-research-wizard/releases/latest
What this does: Installs the research wizard command-line tool on your computer.
Expected result: You should see an “Installation successful” message.
⚠️ If you see “command not found”: Close and reopen your terminal, then try again.
Step 2: Set Up AWS Account
If you don’t have an AWS account:
- Go to aws.amazon.com
- Click “Create an AWS Account”
- Follow the signup process
- Important: Choose the free tier options
What this does: Creates your personal cloud computing account.
Expected result: You receive email confirmation from AWS.
💰 Cost note: Account creation is free. You only pay for resources you use.
Step 3: Configure Your Credentials
aws-research-wizard config setup
The wizard will ask for:
- AWS Access Key: Found in AWS Console → Security Credentials
- Secret Key: Created with your access key
- Region: Choose us-west-2 (recommended for astronomy, with good availability of high-memory instances)
What this does: Connects the research wizard to your AWS account.
Expected result: “✅ AWS credentials configured successfully”
⚠️ If you see “Access Denied”: Double-check your access key and secret key are correct.
Step 4: Validate Your Setup
aws-research-wizard deploy validate --domain astronomy_astrophysics --region us-west-2
What this does: Checks that everything is working before we spend money.
Expected result:
✅ AWS credentials valid
✅ Domain configuration valid: astronomy_astrophysics
✅ Region valid: us-west-2 (4 availability zones)
🎉 All validations passed!
Step 5: Deploy Your Astronomy Environment
aws-research-wizard deploy start --domain astronomy_astrophysics --region us-west-2 --instance r6i.2xlarge
What this does: Creates your astronomy computing environment optimized for large image processing.
This will take: 5-7 minutes
Expected result:
🎉 Deployment completed successfully!
Deployment Details:
Instance ID: i-1234567890abcdef0
Public IP: 12.34.56.78
SSH Command: ssh -i ~/.ssh/id_rsa ubuntu@12.34.56.78
Memory: 64GB RAM for large FITS processing
Storage: 1TB NVMe SSD for fast data access
💰 Billing starts now: Your environment costs about $1.00 per hour while running.
Step 6: Connect to Your Environment
Use the SSH command from the previous step:
ssh -i ~/.ssh/id_rsa ubuntu@12.34.56.78
What this does: Connects you to your astronomy computer in the cloud.
Expected result: You see a command prompt like ubuntu@ip-10-0-1-123:~$
⚠️ If connection fails: Your computer might block SSH. Try adding -o StrictHostKeyChecking=no to the command.
Step 7: Explore Your Astronomy Tools
Your environment comes pre-installed with:
Core Astronomy Tools
- AstroPy: Core Python astronomy package - Type python -c "import astropy; print(astropy.__version__)" to check
- DS9: FITS image viewer - Type ds9 --version to check
- IRAF: Image Reduction and Analysis Facility - Type which pyraf to check
- SExtractor: Source extraction - Type sextractor --version to check
- TOPCAT: Table analysis tool - Type which topcat to check
Try Your First Command
python -c "import astropy; print('AstroPy version:', astropy.__version__)"
What this does: Shows AstroPy version and confirms astronomy tools are installed.
Expected result: You see AstroPy version info confirming astronomical Python libraries are ready.
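To check every tool in one pass, a short Python script can look each one up. This is a minimal sketch: the executable names are simply the ones used in the list above, and they are an assumption about how the environment installs each tool.

```python
import importlib.util
import shutil

# Executable names are taken from the commands listed in this step; adjust
# them if your environment installs a tool under a different name.
COMMANDS = ["ds9", "pyraf", "sextractor", "topcat"]
PYTHON_PACKAGES = ["astropy"]


def check_environment(commands=COMMANDS, packages=PYTHON_PACKAGES):
    """Return a dict mapping each tool name to True (found) or False (missing)."""
    status = {}
    for name in commands:
        # shutil.which returns the full path, or None if the command is not on PATH
        status[name] = shutil.which(name) is not None
    for name in packages:
        # find_spec returns None when the package cannot be imported
        status[name] = importlib.util.find_spec(name) is not None
    return status


if __name__ == "__main__":
    for tool, found in check_environment().items():
        print(f"{'OK     ' if found else 'MISSING'} {tool}")
```

A MISSING line only means the name was not found on the PATH. The tool may still be installed under another name.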
Step 8: Process Real Astronomical Data from AWS Open Data
Let’s analyze real data from several space missions and ground-based surveys:
📊 Data Download Summary:
- Hubble Space Telescope: ~3.5 GB (high-resolution imaging)
- Zwicky Transient Facility: ~2.1 GB (time-domain survey data)
- WISE All-Sky Survey: ~1.8 GB (infrared observations)
- Gaia star catalog: ~400 MB (astrometric sample)
- Total download: ~7.8 GB
- Estimated time: 15-20 minutes on typical broadband
# Create working directory
mkdir -p ~/astronomy-tutorial
cd ~/astronomy-tutorial
# Download real astronomical data from AWS Open Data
echo "Downloading Hubble Space Telescope data (~3.5GB)..."
aws s3 cp s3://stpubdata/hst/public/icqe/icqe01030/icqe01030_drz.fits . --no-sign-request
echo "Downloading Zwicky Transient Facility survey data (~2.1GB)..."
aws s3 cp s3://ztf-releases/dr14/field000/field000001/ztf_000001_zg_c01_q1_dr14.fits . --no-sign-request
echo "Downloading WISE infrared survey data (~1.8GB)..."
aws s3 cp s3://nasa-heasarc/wise/wise_allsky_4band_p1bs_psd/wise_allsky_4band_p1bs_psd_0001.fits . --no-sign-request
echo "Downloading Gaia star catalog (~400MB)..."
aws s3 cp s3://gaia-data/gaia_dr3/gaia_source_sample.fits . --no-sign-request
# Create a local reference for the main analysis
cp icqe01030_drz.fits sample_galaxy.fits
echo "Real astronomical data downloaded successfully!"
What this data contains:
- Hubble Space Telescope: 0.05” resolution imaging of galaxies and nebulae
- Zwicky Transient Facility: 3.7-day cadence survey for supernovae and asteroids
- WISE All-Sky Survey: 3.4-22 μm infrared observations of 750 million objects
- Gaia: Astrometric and photometric data for 1.8 billion stars
- Format: FITS files with WCS coordinate information and calibrated fluxes
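A FITS file can hold several header/data units (HDUs), and the image is not always in the first one. The sketch below lists what a file contains before you analyze it. The helper name describe_hdus is hypothetical; it uses only astropy.io.fits.

```python
from astropy.io import fits


def describe_hdus(path):
    """Return (index, name, shape) for every HDU; shape is None when no image array is present."""
    summary = []
    with fits.open(path) as hdu_list:
        for index, hdu in enumerate(hdu_list):
            # Image HDUs expose a shape tuple; an empty primary HDU reports (),
            # and table HDUs have no shape attribute at all
            shape = getattr(hdu, "shape", None) or None
            summary.append((index, hdu.name, shape))
    return summary


if __name__ == "__main__":
    import sys
    for row in describe_hdus(sys.argv[1]):
        print(row)
```

Save the block as a script and pass it a file name, for example sample_galaxy.fits, to see which HDU holds the 2-D image.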
Basic FITS Image Analysis
# Create Python script for FITS analysis
cat > fits_analysis.py << 'EOF'
import numpy as np
from astropy.io import fits
from astropy.stats import sigma_clipped_stats
import matplotlib.pyplot as plt
print("Loading FITS image...")
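hdu_list = fits.open('sample_galaxy.fits')
# HST drizzled products usually keep the image in an extension (e.g. 'SCI'),
# so use the first HDU that holds a 2-D array instead of assuming HDU 0
image_data = next(h.data for h in hdu_list if getattr(h.data, 'ndim', 0) == 2)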
print(f"Image shape: {image_data.shape}")
print(f"Image data type: {image_data.dtype}")
# Calculate image statistics
mean, median, std = sigma_clipped_stats(image_data, sigma=3.0)
print(f"Image statistics:")
print(f" Mean: {mean:.2f}")
print(f" Median: {median:.2f}")
print(f" Standard deviation: {std:.2f}")
# Find brightest pixel (likely a star or galaxy core)
max_value = np.nanmax(image_data)
max_location = np.unravel_index(np.nanargmax(image_data), image_data.shape)
print(f"Brightest pixel: {max_value:.2f} at position {max_location}")
# Count pixels above threshold (a single bright source usually covers many pixels)
threshold = median + 5 * std
bright_pixels = np.sum(image_data > threshold)
print(f"Bright pixels (>5σ above background): {bright_pixels}")
hdu_list.close()
print("✅ FITS image analysis completed!")
EOF
python3 fits_analysis.py
What this does: Analyzes real Hubble Space Telescope data to find astronomical sources.
This will take: 1-2 minutes
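The 5σ threshold step in fits_analysis.py counts individual pixels, and one star or galaxy usually covers many of them. One simple way to turn bright pixels into a rough source count is to group touching pixels into connected regions. The sketch below does that on a synthetic frame with scipy.ndimage.label. It is an illustration of the idea, not the method used by the tutorial scripts.

```python
import numpy as np
from scipy import ndimage


def count_bright_regions(image, threshold):
    """Return (number of bright pixels, number of connected bright regions)."""
    # NaN pixels compare as False, so they never count as bright
    mask = image > threshold
    # label() assigns one integer per connected region (edge-sharing neighbours)
    _, n_regions = ndimage.label(mask)
    return int(mask.sum()), n_regions


# Synthetic 8x8 frame: flat background with two separate bright blobs
frame = np.zeros((8, 8))
frame[1:3, 1:3] = 100.0   # 4-pixel blob
frame[5:7, 4:7] = 100.0   # 6-pixel blob
n_pixels, n_sources = count_bright_regions(frame, threshold=50.0)
print(n_pixels, n_sources)  # 10 2
```

On real images, blended or extended objects can merge into one region, so treat the count as an estimate.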
Photometry Analysis
# Create photometry script
cat > photometry.py << 'EOF'
import numpy as np
from astropy.io import fits
from astropy.stats import sigma_clipped_stats
from photutils.detection import DAOStarFinder
from photutils.aperture import CircularAperture, aperture_photometry
print("Starting photometry analysis...")
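# Load image (use the first HDU that holds a 2-D array; HST products
# usually keep the image in an extension rather than in HDU 0)
hdu_list = fits.open('sample_galaxy.fits')
data = next(h.data for h in hdu_list if getattr(h.data, 'ndim', 0) == 2)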
# Calculate background statistics
mean, median, std = sigma_clipped_stats(data, sigma=3.0)
print(f"Background level: {median:.2f} ± {std:.2f}")
# Find sources
daofind = DAOStarFinder(fwhm=3.0, threshold=5.*std)
sources = daofind(data - median)
if sources is not None:
    print(f"Found {len(sources)} sources")
    # Perform aperture photometry on first 10 sources
    positions = np.transpose((sources['xcentroid'][:10], sources['ycentroid'][:10]))
    apertures = CircularAperture(positions, r=4.)
    phot_table = aperture_photometry(data, apertures)
    print("Photometry results (first 10 sources):")
    for i in range(min(10, len(phot_table))):
        print(f"Source {i+1}: flux = {phot_table['aperture_sum'][i]:.1f}")
else:
    print("No sources detected")
hdu_list.close()
print("✅ Photometry analysis completed!")
EOF
python3 photometry.py
What this does: Performs automated source detection and photometry on astronomical images.
Expected result: Shows detected sources and their measured brightness values.
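To see what aperture photometry measures, here is a NumPy-only sketch: it sums the pixels inside a circle and subtracts a sky level estimated from a surrounding ring. The function name and the ring radii are assumptions chosen for illustration. It counts whole pixels by their centers, so its numbers will differ slightly from the photutils result.

```python
import numpy as np


def simple_aperture_flux(image, x0, y0, r=4.0, r_in=6.0, r_out=9.0):
    """Background-subtracted flux inside a circular aperture centred on (x0, y0).

    A pixel is counted whole if its centre lies inside the aperture, so the
    result approximates an exact-overlap measurement.
    """
    yy, xx = np.indices(image.shape)
    dist = np.hypot(xx - x0, yy - y0)
    aperture = dist <= r
    annulus = (dist >= r_in) & (dist <= r_out)
    # The median of the ring estimates the local sky level per pixel
    sky_per_pixel = np.median(image[annulus])
    return image[aperture].sum() - sky_per_pixel * aperture.sum()


# Synthetic test: constant sky of 10 counts plus a 500-count point source
frame = np.full((41, 41), 10.0)
frame[20, 20] += 500.0
print(simple_aperture_flux(frame, x0=20, y0=20))  # 500.0
```

The sky contribution cancels, leaving only the 500 counts that belong to the source.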
🎉 Success! You’ve analyzed real telescope data in the cloud.
Step 9: Coordinate System Analysis
Test advanced astronomy capabilities:
# Create coordinate analysis script
cat > coordinates.py << 'EOF'
from astropy.coordinates import SkyCoord
from astropy import units as u
from astropy.io import fits
from astropy.wcs import WCS
print("Analyzing coordinate systems...")
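# Load FITS header for WCS information (taken from the first HDU that
# holds a 2-D image; HST products usually keep it in an extension)
hdu_list = fits.open('sample_galaxy.fits')
header = next(h.header for h in hdu_list if getattr(h.data, 'ndim', 0) == 2)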
try:
    # Create WCS object
    wcs = WCS(header)
    print(f"Coordinate system: {wcs.wcs.ctype}")
    print(f"Reference pixel: {wcs.wcs.crpix}")
    print(f"Reference coordinate: {wcs.wcs.crval}")
    print(f"Pixel scale: {wcs.pixel_scale_matrix}")
    # Convert pixel coordinates to sky coordinates
    pixel_coords = [[100, 100], [200, 200], [300, 300]]
    for i, (x, y) in enumerate(pixel_coords):
        sky_coord = wcs.pixel_to_world(x, y)
        print(f"Pixel ({x}, {y}) → Sky: {sky_coord}")
except Exception as e:
    print(f"WCS analysis not available: {e}")
    print("This is normal for some FITS files without coordinate information")
# Demonstrate coordinate transformations
print("\nCoordinate system examples:")
# Famous astronomical objects
m31 = SkyCoord('00h42m44.3s', '+41d16m09s', frame='icrs')
print(f"Andromeda Galaxy (M31): {m31}")
galactic_center = SkyCoord('17h45m40s', '-29d00m28s', frame='icrs')
galactic_coord = galactic_center.galactic
print(f"Galactic Center in Galactic coordinates: {galactic_coord}")
hdu_list.close()
print("✅ Coordinate analysis completed!")
EOF
python3 coordinates.py
What this does: Demonstrates astronomical coordinate system handling and transformations.
Expected result: Shows coordinate system information and celestial coordinate examples.
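The coordinate strings passed to SkyCoord are in sexagesimal form (hours or degrees, then minutes and seconds). As a worked example of the arithmetic, the sketch below converts them to decimal degrees with plain Python. The two helper functions are hypothetical and accept only the exact string layout used in this guide.

```python
import re


def ra_to_degrees(ra):
    """Convert 'HHhMMmSS.Ss' right ascension to decimal degrees (1 hour = 15 degrees)."""
    h, m, s = map(float, re.fullmatch(r"(\d+)h(\d+)m([\d.]+)s", ra).groups())
    return 15.0 * (h + m / 60.0 + s / 3600.0)


def dec_to_degrees(dec):
    """Convert '+DDdMMmSS.Ss' declination to decimal degrees, keeping the sign."""
    sign, d, m, s = re.fullmatch(r"([+-]?)(\d+)d(\d+)m([\d.]+)s", dec).groups()
    value = float(d) + float(m) / 60.0 + float(s) / 3600.0
    return -value if sign == "-" else value


print(round(ra_to_degrees("00h42m44.3s"), 4))   # 10.6846  (M31 right ascension)
print(round(dec_to_degrees("+41d16m09s"), 4))   # 41.2692  (M31 declination)
print(round(dec_to_degrees("-29d00m28s"), 4))   # -29.0078 (Galactic Center declination)
```

In real analysis, let SkyCoord do this parsing. The sketch only shows what happens underneath.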
Optional: Using Your Own Astronomy and Astrophysics Data
Instead of the tutorial data, you can analyze your own astronomy and astrophysics datasets:
Upload Your Data
# Option 1: Upload from your local computer
scp -i ~/.ssh/id_rsa your_data_file.* ubuntu@12.34.56.78:~/astronomy-tutorial/
# Option 2: Download from your institution's server
wget https://your-institution.edu/data/research_data.csv
# Option 3: Access your AWS S3 bucket
aws s3 cp s3://your-research-bucket/astronomy_astrophysics-data/ . --recursive
Common Data Formats Supported
- FITS files (.fits, .fit): Astronomical images and spectra
- HDF5 data (.h5, .hdf5): Large telescope survey datasets
- ASCII tables (.dat, .txt): Photometry and astrometry catalogs
- VOTable format (.xml, .vot): Virtual Observatory data exchange
- Time series data (.csv, .json): Variable star and exoplanet observations
Replace Tutorial Commands
Simply substitute your filenames in any tutorial command:
# Instead of tutorial data:
ds9 sample_galaxy.fits
# Use your data:
ds9 YOUR_OBSERVATION.fits
Data Size Considerations
- Small datasets (<10 GB): Process directly on the instance
- Large datasets (10-100 GB): Use S3 for storage, process in chunks
- Very large datasets (>100 GB): Consider multi-node setup or data preprocessing
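Processing in chunks can be as simple as walking through an image a band of rows at a time. The sketch below is a minimal illustration with a hypothetical helper, strip_medians. It works on any 2-D array, including the memory-mapped array that astropy returns from fits.open(path, memmap=True).

```python
import numpy as np


def strip_medians(image, rows_per_strip=1024):
    """Yield (first_row, median) for consecutive horizontal strips of a 2-D array."""
    for start in range(0, image.shape[0], rows_per_strip):
        # Slicing a memory-mapped array reads only this strip from disk
        strip = np.asarray(image[start:start + rows_per_strip])
        yield start, float(np.nanmedian(strip))


# Small in-memory stand-in for a large image
demo = np.arange(12, dtype=float).reshape(4, 3)
print(list(strip_medians(demo, rows_per_strip=2)))  # [(0, 2.5), (2, 8.5)]
```

Choose rows_per_strip so that one strip fits comfortably in memory.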
Step 10: Monitor Your Costs
Check your current spending:
exit # Exit SSH session first
aws-research-wizard monitor costs --region us-west-2
Expected result: Shows costs so far (should be under $8 for this tutorial)
Step 11: Clean Up (Important!)
When you’re done experimenting:
aws-research-wizard deploy delete --region us-west-2
Type y when prompted.
What this does: Stops billing by removing your cloud resources.
💰 Important: Always clean up to avoid ongoing charges.
Expected result: “🗑️ Deletion completed successfully”
Understanding Your Costs
What You’re Paying For
- Compute: $1.00 per hour for high-memory instance while environment is running
- Storage: $0.10 per GB per month for astronomical data you save
- Data Transfer: Usually free for astronomy data amounts
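For a rough sense of how these two charges add up, the sketch below multiplies them out. The default rates are the figures quoted in this guide, not live AWS prices, and the function is a hypothetical helper. Your actual bill may include charges that are not modelled here.

```python
def estimate_monthly_cost(hours_per_day, stored_gb, days=30,
                          hourly_rate=1.00, storage_rate=0.10):
    """Rough monthly cost in USD: compute hours plus stored data.

    Default rates are the figures quoted in this guide, not live AWS prices.
    """
    compute = hours_per_day * days * hourly_rate
    storage = stored_gb * storage_rate
    return round(compute + storage, 2)


# Example: 6 hours of analysis per day with 1000 GB kept in storage
print(estimate_monthly_cost(hours_per_day=6, stored_gb=1000))  # 280.0
```

Setting hours_per_day to 0 shows what stored data alone costs while the environment is deleted.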
Cost Control Tips
- Always delete environments when not needed
- Use spot instances for 60% savings (advanced)
- Store large survey datasets in S3, not on the instance
- Monitor memory usage to ensure efficient processing of large FITS files
Typical Monthly Costs by Usage
- Light use (15 hours/week): $250-400
- Medium use (4 hours/day): $500-800
- Heavy use (8 hours/day): $1000-1600
What’s Next?
Now that you have a working astronomy environment, you can:
Learn More About Astronomical Data Analysis
- Large Survey Data Processing Tutorial
- Multi-wavelength Analysis Guide
- Cost Optimization for Astronomy
Explore Advanced Features
- Distributed processing of survey data
- Team collaboration with astronomical databases
- Automated telescope data pipelines
Join the Astronomy Community
Extend and Contribute
🚀 Help us expand AWS Research Wizard!
Missing a tool or domain? We welcome suggestions for:
- New astronomy and astrophysics software (e.g., IRAF, SAOImage DS9, CASA, AIPS, Montage)
- Additional domain packs (e.g., exoplanet research, cosmology, stellar physics, galactic astronomy)
- New data sources or tutorials for specific research workflows
How to contribute:
This is an open research platform - your suggestions drive our development roadmap!
Troubleshooting
Common Issues
Problem: “AstroPy import error” during analysis
Solution: Check the Python environment with which python3, and reinstall if needed with pip install astropy
Prevention: Wait 5-7 minutes after deployment for all astronomy packages to initialize
Problem: “FITS file corrupted” error
Solution: Verify the download with file sample_galaxy.fits, and re-download if needed
Prevention: Always check file integrity with the file command after downloads
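A slightly stricter check than the file command is possible because of how FITS is laid out: a standard file begins with the SIMPLE keyword and is written in 2880-byte blocks. The sketch below tests both properties. The helper name is hypothetical, and compressed files such as .fits.gz will not pass it.

```python
import os

FITS_BLOCK = 2880  # FITS files are written in 2880-byte blocks


def looks_like_fits(path):
    """Cheap structural check: correct first keyword and a whole number of blocks."""
    size = os.path.getsize(path)
    with open(path, "rb") as handle:
        first_card = handle.read(9)
    # The primary header of a standard FITS file starts with the SIMPLE keyword
    starts_ok = first_card == b"SIMPLE  ="
    return starts_ok and size > 0 and size % FITS_BLOCK == 0


if __name__ == "__main__":
    import sys
    for name in sys.argv[1:]:
        print(name, "OK" if looks_like_fits(name) else "suspect, consider re-downloading")
```

A truncated download usually fails the block-size test. Passing it does not prove the pixel data are intact.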
Problem: “Memory error” during large image processing
Solution: Use a larger instance type or process images in smaller sections
Prevention: Monitor memory usage with htop during analysis
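Before loading a large image, you can estimate how much memory it needs from its dimensions and data type. The sketch below is a back-of-the-envelope helper. The default of three working copies (for example the image, a background-subtracted version, and a mask) is an assumption, and the mosaic size is hypothetical.

```python
import numpy as np


def image_memory_gb(shape, dtype="float32", copies=3):
    """Approximate RAM in GB needed for `copies` arrays of the given shape and dtype."""
    n_pixels = int(np.prod(shape, dtype=np.int64))
    return n_pixels * np.dtype(dtype).itemsize * copies / 1e9


# A 40000 x 40000 float32 mosaic with three working copies (hypothetical size)
print(round(image_memory_gb((40000, 40000)), 1))  # 19.2
```

Compare the result with the memory of your instance to decide whether to process the image whole or in sections.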
Problem: “DS9 display not working” in SSH session
Solution: Enable X11 forwarding: ssh -X -i ~/.ssh/id_rsa ubuntu@ip-address
Prevention: For headless analysis, use Python matplotlib instead of DS9
Getting Help
- Check the astronomy troubleshooting guide
- Ask in community forum
- File an issue on GitHub
Emergency: Stop All Billing
If something goes wrong and you want to stop all charges immediately:
aws-research-wizard emergency-stop --region us-west-2 --confirm
Feedback
This guide should take 20 minutes and cost under $25.
Last updated: January 2025 | Reading level: 8th grade | Tutorial tested: January 15, 2025