1 Background

What information can we get from looking at the Montipora capitata embryos that were exposed to varying levels of PVC leachate (High: 1 mg/L, Mid: 0.1 mg/L, Low: 0.01 mg/L)? In this experiment, embryos developed from a bundle-bundle cross: one egg-sperm bundle from one parent colony was placed in a 20 mL scintillation vial with an egg-sperm bundle from a different parent colony. Colonies are not known to self-fertilize, so it is important that each bundle came from a different parent colony. Each Montipora capitata egg-sperm bundle consists of a ‘bundle’ of eggs bound together around a ‘packet’ of sperm.

Note

Each egg-sperm bundle contains 15 ± 5.1 oocytes (mean ± SD, n = 214, from 26 colonies) (Padilla-Gamiño et al. 2011).

Padilla-Gamiño, J. L., T. M. Weatherby, R. G. Waller, and R. D. Gates. 2011. “Formation and Structural Organization of the Egg–Sperm Bundle of the Scleractinian Coral Montipora Capitata.” Coral Reefs 30 (2): 371–80. https://doi.org/10.1007/s00338-010-0700-8.

This means that each bundle-bundle cross, at 100% fertilization success, could yield roughly 18 to 42 maturing embryos (about 30 on average, since 2 × 15 = 30).
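One way to bound the yield per cross can be sketched as follows. This is only an illustration: it assumes exactly two bundles per vial, takes the mean and SD from Padilla-Gamiño et al. (2011), and uses mean ± 1 SD per bundle as a rough range. Other ways of bounding the range give somewhat different limits, and the function name is made up for this example.

```python
# Rough embryo yield for one bundle-bundle cross at 100% fertilization.
# Assumption: every vial holds exactly two bundles.
MEAN_OOCYTES = 15.0   # oocytes per bundle (Padilla-Gamiño et al. 2011)
SD_OOCYTES = 5.1      # standard deviation per bundle
BUNDLES_PER_CROSS = 2


def expected_embryos(mean, sd, n_bundles):
    """Return (low, expected, high) using mean +/- 1 SD per bundle."""
    low = n_bundles * (mean - sd)
    expected = n_bundles * mean
    high = n_bundles * (mean + sd)
    return low, expected, high


low, expected, high = expected_embryos(MEAN_OOCYTES, SD_OOCYTES, BUNDLES_PER_CROSS)
print(f"{low:.1f} to {high:.1f} embryos, expected {expected:.0f}")
```

Observed counts above the upper bound would suggest that a vial received more than two bundles or that fragments were counted as whole embryos.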

Question #1: Do we use 100% fertilization success as our ‘baseline’ to compare all treatment groups (including the control)? This could provide insight into fertilization success ratios for bundle-bundle cross experiments, but may not be applicable to oceanic conditions, where there is sperm competition, mixing, and greater potential for dilution.

Answer: We will only assess fertilization success at the cleavage stage, where we can presumably still see the unfertilized oocytes (any eggs that have not cleaved). By the prawn chip and early gastrula phases, the unfertilized eggs have already dissolved, so in these phases we will assess embryo mortality relative to the fertilization rates estimated from the cleavage data.

Question #2: How should we treat fragmented embryos? I can tell from the cell structure that they reached a certain phase, but I can’t tell how many fragments make up one whole embryo, which matters for count data. Should we count the fragmented embryos? Are they fragmented, or are they deformed (abnormalities)?

| Variable | Unit/type | Statistical test |
|---|---|---|
| Developmental stage | cell stage, ordinal categorical variable | Chi-Square Goodness of Fit or Log-Likelihood Ratio |
| Proportion of abnormality per cross | ratio, continuous variable | Log-Likelihood Ratio |
| Proportion of abnormality per treatment | ratio, continuous variable | Log-Likelihood Ratio |
| Proportion of fertilization success | ratio, continuous variable | Log-Likelihood Ratio |

For every embryo in every sample we have:

- Embryo stage (cell stage, ordinal categorical variable)
- Embryo status (typical/malformed, binary categorical variable)

With this data we can assess for each sample:

- Embryo counts (counts of embryos in each stage and status)
- Embryo proportions (proportions of embryos in each stage and status)
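The step from per-embryo records to per-sample counts and proportions might look like the sketch below. The stage and status labels are placeholders rather than the real scoring scheme, and `summarize` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical per-embryo records for one sample: (stage, status).
# Labels are placeholders, not the actual scoring categories.
embryos = [
    ("2-cell", "typical"),
    ("4-cell", "typical"),
    ("4-cell", "malformed"),
    ("4-cell", "typical"),
    ("egg", "typical"),
]


def summarize(records):
    """Count embryos per (stage, status) and convert counts to proportions."""
    counts = Counter(records)
    total = sum(counts.values())
    proportions = {key: n / total for key, n in counts.items()}
    return counts, proportions


counts, proportions = summarize(embryos)
for key in counts:
    print(key, counts[key], round(proportions[key], 2))
```

Keeping the raw counts alongside the proportions preserves the sample size, which binomial-type models need.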

Across samples we want to analyze:

- Timing
- Abnormality
- Survival

2 Examples from the literature

2.1 Hédouin and Gates (2013)

Hédouin, Laetitia, and Ruth D. Gates. 2013. “Assessing Fertilization Success of the Coral Montipora Capitata Under Copper Exposure: Does the Night of Spawning Matter?” Marine Pollution Bulletin 66 (1): 221–24. https://doi.org/10.1016/j.marpolbul.2012.11.020.

- Looked at initial cleavage (3 hpf)
- Bundle-bundle cross in 10 mL of 0.45 µm filtered seawater (FSW)

“After preservation, eggs were examined for the fertilization (showing normal cleavage) and the proportion of successful fertilization within treatment determined. Eggs showing no signs of cleavage were scored as unfertilized.

The proportions of fertilized eggs were used to estimate EC₅₀, the concentration of copper that reduced the fertilization success rate by 50%, relative to untreated controls using R 2.11© software. A general model fitting function for concentration/dose response models was applied using the package drc from R 2.11© software and the EC₅₀, EC₂₀ and EC₁₀ calculated. After transformation, data on fertilization success (%) were analyzed using one-way ANOVA and post hoc Tukey comparisons to detect any significant differences (p < 0.05) among the treatments. One-way ANOVA and post hoc Tukey comparisons were also used to determine significant differences in eggs per bundle and egg size among the different spawning times.

The fertilization rates in the untreated controls were relatively high for all the experiments, ranging from 86% to 96% (Fig. 1), with the exception of the last experiment performed on August 1st, 2008, where almost no fertilization (<10%) was observed in either control and experimental vials.”
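For orientation, effect concentrations such as EC₅₀, EC₂₀ and EC₁₀ follow directly from a fitted dose-response curve. The sketch below assumes a two-parameter log-logistic curve, a model family commonly fitted with drc but not necessarily the one the authors used. The slope and EC₅₀ values are invented for illustration and are not taken from the paper.

```python
# Two-parameter log-logistic dose-response curve:
#   response(x) = 1 / (1 + (x / ec50) ** slope)
# where response is fertilization relative to the control (1.0 = control level).
# SLOPE and EC50 are made-up values for illustration only.
SLOPE = 2.0
EC50 = 20.0  # hypothetical concentration units


def relative_response(conc, ec50, slope):
    """Fraction of control fertilization remaining at a given concentration."""
    return 1.0 / (1.0 + (conc / ec50) ** slope)


def effect_concentration(p, ec50, slope):
    """Concentration that reduces the response by p percent (0 < p < 100)."""
    return ec50 * (p / (100.0 - p)) ** (1.0 / slope)


for p in (10, 20, 50):
    ec = effect_concentration(p, EC50, SLOPE)
    remaining = relative_response(ec, EC50, SLOPE)
    print(f"EC{p} = {ec:.2f}, relative response there = {remaining:.2f}")
```

In practice the two parameters would come from fitting the curve to the observed fertilization proportions; the closed-form inversion above only shows how the EC values relate to them.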

2.2 Hagedorn et al. (2015)

Hagedorn, Mary, Ann Farrell, Virginia Carter, Nikolas Zuchowicz, Erika Johnston, Jacqueline Padilla-Gamiño, Sarath Gunasekera, and Valerie Paul. 2015. “Effects of Toxic Compounds in Montipora Capitata on Exogenous and Endogenous Zooxanthellae Performance and Fertilization Success.” PLOS ONE 10 (2): e0118364. https://doi.org/10.1371/journal.pone.0118364.

“Fertilization success using untreated sperm was 79 ± 4% SEM, whereas the success rate dropped significantly after exposure to the crushed eggs, 1.3 ± 0% SEM. Unlike the eggs and the larvae, M. capitata sperm did not reduce the photosynthetic competency of P. compressa zooxanthellae, suggesting the sperm was nontoxic.”

2.3 Chille et al. (2022)

Chille, E. E., E. L. Strand, F. Scucchia, M. Neder, V. Schmidt, M. O. Sherman, T. Mass, and H. M. Putnam. 2022. “Energetics, but Not Development, Is Impacted in Coral Embryos Exposed to Ocean Acidification.” Journal of Experimental Biology 225 (19): jeb243187. https://doi.org/10.1242/jeb.243187.

- Morphology was assessed from egg to >4-cell stage at 4 hpf

“All statistical analyses for morphological data were performed in RStudio (v1.3.959; https://www.rstudio.com/), using R version 4.0.2 (https://www.r-project.org/). A beta regression model (formula=proportion∼cleavage stage+treatment+cleavage stage:treatment) was used to analyze differences in the proportion of cells at each cleavage stage and treatment using the betareg R package (v3.1-4; Cribari-Neto and Zeileis, 2010). Differences between cleavage stages and treatments were then computed using the joint_tests function from the emmeans R package (v1.4.4; https://cran.r-project.org/package=emmeans), which effectively runs the beta regression model as a type III ANOVA. Additionally, a one-way nested ANOVA analysis tested the effect of tank on embryo and planula size wherein tank ID was nested within treatment. After determining that tank effects were non-significant (P>0.05), one-way ANOVA analysis was used to test for the effect of treatment on fertilized embryo, gastrula and planula volume. Post hoc Tukey HSD tests were conducted when the effect of treatment was significant (P<0.05). Data were visually examined for normal distribution and equal variance. The dependent variable was square-root transformed for the gastrulation and planula life stages in order to meet statistical assumptions prior to ANOVA analysis. Data points for these life stages were back-transformed for visualization in Fig. 2.”

3 Counts or Proportions?

- Use count data for absolute occurrences when the total number of possible events is not known or not relevant.
- Use proportional data for relative comparisons, or when expressing rates or probabilities against a known total.

Count data is appropriate when measuring the actual number of occurrences (e.g., number of embryos observed or number of malformed embryos). Proportional data should be used when expressing data as a ratio or fraction of a whole, typically representing the probability or relative frequency of an event (e.g., survival rate, percent of embryos surviving out of the total).

3.1 When to Use Count Data

- When you are interested in the absolute frequency of events, such as the number of embryos at each timepoint or the number of samples with a certain outcome.
- When the key variable is inherently discrete, non-negative, and possibly skewed (e.g., counts often have a lot of zeros and/or high maximum values).
- Appropriate statistical tests/models: Poisson regression, negative binomial regression, or zero-inflated models if there are many zeros.
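A quick screen for choosing between these count models is the variance-to-mean ratio: under a Poisson model it is close to 1, while values well above 1 indicate overdispersion and point toward a negative binomial model. The counts below are hypothetical, and this is a rough diagnostic, not a formal test.

```python
from statistics import mean, variance

# Hypothetical embryo counts per sample within one treatment group.
counts = [18, 25, 9, 31, 14, 22, 0, 27]


def dispersion_index(values):
    """Sample variance divided by the mean; about 1 under a Poisson model."""
    return variance(values) / mean(values)


ratio = dispersion_index(counts)
print(f"mean = {mean(counts):.2f}, variance = {variance(counts):.2f}, ratio = {ratio:.2f}")
```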

3.2 When to Use Proportional Data

- When you want to standardize results or compare groups differing in total size (e.g., expressing embryo survival as a proportion of initial embryos).
- When each observation represents a part of a whole and comparisons across different sample sizes are essential.
- Appropriate statistical tests/models: Logistic regression, binomial models, or Z-tests for proportions.
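As one concrete case, a two-sided Z-test for two independent proportions can be written with the standard library alone. The counts below are hypothetical, and pooling eggs across vials, as done here, ignores vial-to-vial variation, so treat this as a sketch rather than a recommended final analysis.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """Two-sided z-test for a difference between two independent proportions."""
    p_a = success_a / total_a
    p_b = success_b / total_b
    # Pooled proportion under the null hypothesis of no difference.
    pooled = (success_a + success_b) / (total_a + total_b)
    se = sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical pooled counts: cleaved eggs out of eggs scored,
# control versus high-leachate treatment.
z, p_value = two_proportion_z_test(success_a=170, total_a=200, success_b=140, total_b=200)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A binomial or logistic regression with vial as a grouping factor would use the same counts while accounting for the clustering.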

4 Timing

5 Abnormalities

6 Survival

If you only have the number of embryos observed at each timepoint without knowing the initial total, you should analyze the apparent decline over time to estimate survival, using methods that compare counts between timepoints. In this case, the data represent cross-sectional snapshots of different samples at 4, 9, and 14 hours post-fertilization, each independent from one another. The appropriate analysis approach is to treat the observations at each timepoint as separate groups and compare them accordingly using methods for independent samples rather than longitudinal repeated measures.

6.1 Key Points for Independent Samples at Different Timepoints

- Each timepoint group is treated as an independent sample.
- Compare counts (or proportions, if normalized) between groups using tests for independent samples: ANOVA or Kruskal-Wallis for continuous/ratio data, or Chi-square for count/frequency data.
- Avoid methods that explicitly model within-subject correlation, since observations are independent across timepoints.
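For frequency data, the Chi-square comparison across independent timepoint groups can be sketched as below. The table of typical versus malformed embryos is hypothetical. The p-value line relies on the fact that, for 2 degrees of freedom, the chi-square upper-tail probability is exactly exp(-x/2), so it only applies to a 3 × 2 table like this one.

```python
from math import exp

# Hypothetical counts of (typical, malformed) embryos pooled by timepoint.
table = {
    "4 hpf": (90, 10),
    "9 hpf": (80, 20),
    "14 hpf": (70, 30),
}


def chi_square_independence(rows):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    rows = [list(r) for r in rows]
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(rows):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, df


stat, df = chi_square_independence(table.values())
p_value = exp(-stat / 2)  # valid only because df == 2 for this 3 x 2 table
print(f"chi-square = {stat:.2f}, df = {df}, p = {p_value:.4f}")
```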

Because the samples at each timepoint are independent (not repeated measures of the same samples), the analysis compares embryo counts or survival proportions between the 4, 9, and 14 hour groups rather than modeling within-sample trajectories. This aligns with the “pre-post no control group” design, where different subjects are sampled at each timepoint.

6.2 Approach Overview

6.3 Structure Your Data

For each sample, you have:
- Time point (4, 9, or 14 hours post-fertilization)
- Observed viable embryo count at that time
You can summarize the data by calculating the mean and variance of embryo counts at each timepoint.
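A minimal sketch of that summary step, assuming the data are available as (timepoint, count) pairs. The values are hypothetical and `summarize_by_timepoint` is an illustrative helper name.

```python
from statistics import mean, variance

# Hypothetical (timepoint in hpf, viable embryo count) pairs, one per sample.
samples = [
    (4, 20), (4, 16), (4, 18),
    (9, 13), (9, 11), (9, 12),
    (14, 8), (14, 6), (14, 7),
]


def summarize_by_timepoint(records):
    """Group counts by timepoint and return {timepoint: (mean, variance)}."""
    groups = {}
    for timepoint, count in records:
        groups.setdefault(timepoint, []).append(count)
    return {t: (mean(c), variance(c)) for t, c in sorted(groups.items())}


summary = summarize_by_timepoint(samples)
for timepoint, (avg, var) in summary.items():
    print(f"{timepoint} hpf: mean = {avg:.1f}, variance = {var:.1f}")
```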

6.4 Comparing Embryo Counts Over Time

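- Analyze how the average count of viable embryos per sample changes across timepoints.
- Calculate the proportionate change from one timepoint to the next: $(\bar{N}_{t_2} - \bar{N}_{t_1}) / \bar{N}_{t_1}$, where $\bar{N}_{t}$ is the mean embryo count per sample at timepoint $t$.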

6.5 Statistical Analysis

- If samples are independent at each timepoint, use one-way ANOVA or nonparametric tests like Kruskal-Wallis.
- To estimate “survival,” take the percentage decrease in embryo count between each timepoint as a proxy for mortality.
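The Kruskal-Wallis option can be sketched without external packages. The counts below are hypothetical. The p-value uses the chi-square approximation, which has 2 degrees of freedom for three groups and is only rough at small sample sizes.

```python
from math import exp


def kruskal_wallis(groups):
    """Kruskal-Wallis H statistic (with tie correction) for a list of groups."""
    pooled = sorted(value for group in groups for value in group)
    n = len(pooled)
    ranks = {}     # value -> average rank (ties share the mean of their ranks)
    tie_term = 0   # sum of (t**3 - t) over groups of tied values
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    rank_sum_term = sum(
        sum(ranks[v] for v in group) ** 2 / len(group) for group in groups
    )
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)
    return h / (1 - tie_term / (n ** 3 - n))


# Hypothetical viable embryo counts per sample at 4, 9, and 14 hpf.
counts_by_timepoint = [
    [20, 16, 18, 19],
    [13, 11, 12, 15],
    [8, 6, 7, 9],
]
h = kruskal_wallis(counts_by_timepoint)
p_value = exp(-h / 2)  # chi-square upper tail, exact form for df = 2 (three groups)
print(f"H = {h:.2f}, approximate p = {p_value:.4f}")
```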

6.6 Visualization

- Create a plot showing the mean embryo count per sample at each timepoint.
- This will visualize survival (or mortality) dynamics for the groups over time.

6.7 Example Calculation

Let’s say you have the following average embryo counts per sample:

- 4 hours: mean = 18 embryos
- 9 hours: mean = 12 embryos
- 14 hours: mean = 7 embryos

Survival proportion to 9 hours: $12/18 = 0.67$ (67%)

Survival proportion to 14 hours: $7/18 = 0.39$ (39%)

Both proportions are relative to the mean count at 4 hours.
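The same calculation can be scripted so that it also reports survival across each interval (for example, 9 to 14 hours), which the figures above do not show. The helper name is hypothetical and the means are those from this example.

```python
# Mean viable embryo counts per sample from the example above.
mean_counts = {4: 18, 9: 12, 14: 7}  # hours post-fertilization -> mean count


def relative_survival(means):
    """Return survival relative to the first timepoint and to the previous one."""
    timepoints = sorted(means)
    first = means[timepoints[0]]
    result = {}
    for previous, current in zip(timepoints, timepoints[1:]):
        result[current] = {
            "vs_first": means[current] / first,
            "vs_previous": means[current] / means[previous],
        }
    return result


survival = relative_survival(mean_counts)
for timepoint, values in survival.items():
    print(
        f"{timepoint} h: {values['vs_first']:.2f} of the 4 h count, "
        f"{values['vs_previous']:.2f} of the previous count"
    )
```

One minus each value gives the corresponding mortality proxy.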

6.8 Important Notes

- Without initial counts, you can only describe relative survival between timepoints, not absolute survival from fertilization.
- Ensure consistency in sample handling to minimize bias.
- Interpret results as changes in observed counts, acknowledging the limitation of not measuring initial embryo numbers.

This method will allow for assessment of relative survival dynamics and how embryo counts decline over time.

https://www.perplexity.ai/search/i-have-a-dataset-of-120-sample-ixpZ9EBOSPS8XkQYWp8FDQ#1

Statistics for embryo scope data: Exploring data from embryonic development of Montipora capitata rice corals exposed to PVC leachate. Sarah Tanja, 09/06/2024.