function create_correlation_heatmap
Generates and saves a correlation heatmap visualizing the relationships between Eimeria infection indicators and performance measures from a pandas DataFrame.
/tf/active/vicechatdev/vice_ai/smartstat_scripts/5a059cb7-3903-4020-8519-14198d1f39c9/analysis_1.py
233 - 252
moderate
Purpose
This function creates a publication-quality correlation heatmap that specifically shows the correlations between two sets of variables: Eimeria infection indicators and performance measures. It computes the full correlation matrix, extracts the relevant subset showing cross-correlations between the two variable groups, and saves the visualization as a high-resolution PNG file. This is particularly useful for analyzing biological or veterinary data to understand how parasitic infections (Eimeria) correlate with various performance metrics.
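As a rough illustration of the "full matrix, then cross-correlation subset" step described above, the sketch below uses the same sample values as the usage example further down. The variable names `full` and `block` are illustrative only and do not appear in the function itself.

```python
import pandas as pd

# Sample data mirroring the usage example on this page
df = pd.DataFrame({
    "eimeria_count": [100, 200, 150, 300, 250],
    "eimeria_severity": [2, 3, 2, 4, 3],
    "weight_gain": [1.2, 0.9, 1.1, 0.7, 0.8],
    "feed_efficiency": [0.8, 0.6, 0.75, 0.5, 0.55],
    "mortality_rate": [0.05, 0.10, 0.07, 0.15, 0.12],
})
eimeria_vars = ["eimeria_count", "eimeria_severity"]
performance_vars = ["weight_gain", "feed_efficiency", "mortality_rate"]

# Full symmetric matrix over all five variables
full = df[eimeria_vars + performance_vars].corr()

# Rectangular block: Eimeria indicators as rows, performance measures as columns
block = full.loc[eimeria_vars, performance_vars]

print(full.shape)   # (5, 5)
print(block.shape)  # (2, 3)
```

The block omits the within-group correlations (for example `eimeria_count` against `eimeria_severity`), which is why the heatmap is rectangular rather than square.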
Source Code
def create_correlation_heatmap(df, eimeria_vars, performance_vars):
    """Create correlation heatmap"""
    all_vars = eimeria_vars + performance_vars
    corr_matrix = df[all_vars].corr()
    # Extract only Eimeria vs Performance correlations
    corr_subset = corr_matrix.loc[eimeria_vars, performance_vars]
    plt.figure(figsize=(12, 8))
    sns.heatmap(corr_subset, annot=True, fmt='.3f', cmap='RdBu_r',
                center=0, vmin=-1, vmax=1, cbar_kws={'label': 'Correlation Coefficient'})
    plt.title('Correlation Heatmap: Eimeria Infection vs Performance Measures',
              fontsize=14, fontweight='bold')
    plt.xlabel('Performance Measures', fontsize=12)
    plt.ylabel('Eimeria Infection Indicators', fontsize=12)
    plt.tight_layout()
    plt.savefig('correlation_heatmap.png', dpi=300, bbox_inches='tight')
    print("\nSaved: correlation_heatmap.png")
    plt.close()
Parameters
| Name | Type | Default | Kind |
|---|---|---|---|
| df | - | - | positional_or_keyword |
| eimeria_vars | - | - | positional_or_keyword |
| performance_vars | - | - | positional_or_keyword |
Parameter Details
df: A pandas DataFrame containing all the data. Must include columns matching the names specified in eimeria_vars and performance_vars parameters. The DataFrame should contain numeric data suitable for correlation analysis.
eimeria_vars: A list of column names (strings) from the DataFrame representing Eimeria infection indicators. These will appear as row labels in the heatmap. Must be valid column names present in df.
performance_vars: A list of column names (strings) from the DataFrame representing performance measures. These will appear as column labels in the heatmap. Must be valid column names present in df.
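The requirements above (column names present in `df`, numeric data) could be checked up front. The following is a minimal sketch, not part of the original script; `validate_heatmap_inputs` is a hypothetical helper name, and the choice of `KeyError` and `TypeError` is an assumption.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def validate_heatmap_inputs(df, eimeria_vars, performance_vars):
    """Hypothetical pre-check: raise early if columns are missing or non-numeric."""
    requested = list(eimeria_vars) + list(performance_vars)

    # Columns named in either list but absent from the DataFrame
    missing = [col for col in requested if col not in df.columns]
    if missing:
        raise KeyError(f"Columns not found in DataFrame: {missing}")

    # Columns whose dtype cannot be used for a correlation
    non_numeric = [col for col in requested if not is_numeric_dtype(df[col])]
    if non_numeric:
        raise TypeError(f"Non-numeric columns: {non_numeric}")


df = pd.DataFrame({
    "eimeria_count": [100, 200, 150],
    "weight_gain": [1.2, 0.9, 1.1],
    "pen_id": ["A", "B", "C"],  # illustrative non-numeric column
})

# Passes silently: both columns exist and are numeric
validate_heatmap_inputs(df, ["eimeria_count"], ["weight_gain"])
```

Calling the helper before `create_correlation_heatmap` would turn an obscure pandas error into a message that names the offending columns.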
Return Value
This function does not return any value (returns None implicitly). Its primary output is a side effect: it saves a PNG file named 'correlation_heatmap.png' to the current working directory and prints a confirmation message to stdout.
Dependencies
- pandas
- numpy
- matplotlib
- seaborn
- scipy
Required Imports
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
Usage Example
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Sample data
data = {
    'eimeria_count': [100, 200, 150, 300, 250],
    'eimeria_severity': [2, 3, 2, 4, 3],
    'weight_gain': [1.2, 0.9, 1.1, 0.7, 0.8],
    'feed_efficiency': [0.8, 0.6, 0.75, 0.5, 0.55],
    'mortality_rate': [0.05, 0.10, 0.07, 0.15, 0.12]
}
df = pd.DataFrame(data)
eimeria_vars = ['eimeria_count', 'eimeria_severity']
performance_vars = ['weight_gain', 'feed_efficiency', 'mortality_rate']
create_correlation_heatmap(df, eimeria_vars, performance_vars)
# Output: Saves 'correlation_heatmap.png' and prints confirmation message
Best Practices
- Ensure all column names in eimeria_vars and performance_vars exist in the DataFrame to avoid KeyError
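- Verify that the columns contain numeric data suitable for correlation analysis; depending on the pandas version, non-numeric columns either raise an error in .corr() or are dropped from the correlation matrix, which then causes a KeyError in the .loc lookup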
- Decide how to handle missing values (NaN) before calling this function: pandas .corr() excludes them pairwise, so each coefficient may be based on a different number of observations
- Be aware that the function overwrites 'correlation_heatmap.png' if it already exists in the current directory
- Consider the number of variables being plotted; too many variables may result in an overcrowded heatmap that's difficult to read
- The function uses Pearson correlation by default (pandas .corr() method); consider if this is appropriate for your data distribution
- The figure is closed after saving (plt.close()), so it is not displayed interactively; call plt.show() before plt.close() if you need to see the plot
- The color scheme 'RdBu_r' is centered at 0, making it easy to distinguish positive (red) from negative (blue) correlations
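
Two of the points above, the correlation method and the effect of missing values, can be explored before plotting. The sketch below is an illustration under stated assumptions: `pairwise_counts` is a hypothetical helper that is not in the original script, and the sample data with NaN values is made up. It switches `.corr()` to Spearman rank correlation and counts how many complete observations back each coefficient.

```python
import numpy as np
import pandas as pd


def pairwise_counts(df, row_vars, col_vars):
    """Hypothetical helper: rows where both variables of a pair are non-missing."""
    present = df[row_vars + col_vars].notna().astype(int)
    # (vars x rows) @ (rows x vars) gives the co-occurrence count for every pair
    counts = present.T @ present
    return counts.loc[row_vars, col_vars]


# Made-up data with one missing value in each column
df = pd.DataFrame({
    "eimeria_count": [100, 200, 150, 300, np.nan],
    "weight_gain": [1.2, 0.9, 1.1, np.nan, 0.8],
})
rows, cols = ["eimeria_count"], ["weight_gain"]

# Same subsetting as the function, but with rank-based correlation
spearman = df[rows + cols].corr(method="spearman").loc[rows, cols]

# Number of complete pairs behind each coefficient
counts = pairwise_counts(df, rows, cols)

print(spearman)
print(counts)
```

Here only three of the five rows contribute to the coefficient, which is easy to miss when reading the heatmap alone. Using Spearman inside `create_correlation_heatmap` itself would require editing its `.corr()` call, since the function takes no method parameter.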
Similar Components
AI-powered semantic similarity - components with related functionality:
- function calculate_correlations (75.9% similar)
- function create_grouped_correlation_plot (75.5% similar)
- function create_scatter_plots (74.7% similar)
- function grouped_correlation_analysis (71.2% similar)
- function main_v26 (70.0% similar)