Week 05 — Performance and Potentiality of Camelina (Camelina sativa L. Crantz) Genotypes in Response to Sowing Date under Mediterranean Environment

1 Week Overview

Session	Content
Session A Part 1	Student-led paper discussion (~35 min)
Session A Part 2	Method deep-dive (this document, ~35 min)
Session B	Hands-on coding workshop

Learning Objectives

Understand why split-plot designs exist and how they differ from RCBD
Recognize that lmer + anova() is a linear mixed model, not classical ANOVA
Identify whole-plot vs subplot error strata and why they change the F-ratio
Interpret G × ST interactions biologically and agronomically
Apply the Zuur data exploration checklist critically to a published paper

1.1 Background & Readings

Paper:

Angelini, L. G., Chehade, L. A., Foschi, L., & Tavarini, S. (2020). Performance and Potentiality of Camelina (Camelina sativa L. Crantz) Genotypes in Response to Sowing Date under Mediterranean Environment. Agronomy, 10(12), 1929. https://doi.org/10.3390/agronomy10121929

Questions to Consider

While reading, always think about:

What is the research question? What factors are being tested?
What is the experimental design? (How many factors? How are treatments arranged? What is the blocking factor?)
What are the response variables? What type of data are they? (continuous, counts, proportions?)
What statistical methods are used? Are they appropriate given the design?
Does the Method-Question-Data Triangle align in this paper?
What would Zuur (Week 1) say about their data exploration? Do you see evidence of it?

1.2 Response variables

Variable	Type
Seed yield (Mg/ha)	Continuous
Oil content (% DW)	Continuous
1000-seed weight (g)	Continuous
Plant height (cm)	Continuous
Siliques per plant	Count
Plant density (No./m²)	Count
Harvest index	Index

Note

Counts were analyzed with Poisson GLMs: potentially the right call (if they checked for zero-inflation). We return to this below.
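A minimal sketch of what such a check could look like in base R. The data are simulated here, not taken from the paper, and the column name siliques is hypothetical. The overdispersion ratio is an addition to the zero-inflation check mentioned above, since both are routine diagnostics for a Poisson fit.

```r
# Hypothetical count data: siliques per plant for two sowing times
# (simulated for illustration, NOT the paper's data)
set.seed(1)
counts <- data.frame(
  ST = factor(rep(c("Autumn", "Spring"), each = 50)),
  siliques = c(rpois(50, lambda = 12), rpois(50, lambda = 8))
)

fit <- glm(siliques ~ ST, family = poisson, data = counts)

# Overdispersion check: Pearson chi-square / residual df should be close to 1
dispersion <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)

# Zero-inflation check: observed zeros vs zeros expected under the fitted Poisson
observed_zeros <- sum(counts$siliques == 0)
expected_zeros <- sum(dpois(0, lambda = fitted(fit)))

c(dispersion = dispersion,
  observed_zeros = observed_zeros,
  expected_zeros = expected_zeros)
```

A dispersion ratio well above 1, or far more observed zeros than expected, would suggest that a plain Poisson model is too optimistic about its standard errors.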

2 The Experimental Design

2.1 Why It Is a Split-Plot

You cannot randomize Autumn vs. Spring sowing within the same small plot — the whole field area for a given replicate must be sown in the same season. Sowing time requires a large experimental unit. This is the defining reason split-plot designs exist.

BLOCK (replicate, n = 4)
├── WHOLE PLOT: Autumn sowing
│   └── subplot: V1, V2, V3, V4, V5, V6, CELINE (randomized)
└── WHOLE PLOT: Spring sowing
    └── subplot: V1, V2, V3, V4, V5, V6, CELINE (randomized)
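The two-stage randomization in the diagram can be sketched in base R. This is an illustration of the general split-plot procedure, not the authors' actual field plan; the seed and the WholePlot and Subplot position columns are assumptions.

```r
set.seed(2020)  # arbitrary seed, for reproducibility only
genotypes <- c(paste0("V", 1:6), "CELINE")
sowing <- c("Autumn", "Spring")

layout <- do.call(rbind, lapply(1:4, function(b) {
  # Stage 1: randomize sowing time to the two whole plots within block b
  st_order <- sample(sowing)
  do.call(rbind, lapply(seq_along(st_order), function(wp) {
    # Stage 2: randomize genotypes to the seven subplots within this whole plot
    data.frame(Block = b,
               WholePlot = wp,
               ST = st_order[wp],
               Subplot = seq_along(genotypes),
               Genotype = sample(genotypes))
  }))
}))

head(layout, 14)  # the two whole plots of block 1
```

Sowing time is randomized once per block, while genotype is randomized separately inside every whole plot. Those two separate randomizations are what create the two error strata discussed next.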

2.2 Two Error Strata

Stratum	Tests	Experimental unit
Whole-plot error	Sowing Time (ST)	Whole plot within block
Subplot error	Genotype (G), G×ST	Subplot within whole plot

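Using a single pooled error (naive ANOVA) typically gives an anti-conservative test for ST (false positives) and an over-conservative test for G and G×ST (false negatives), because the whole-plot error is usually larger than the subplot error and pooling averages the two.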
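To see the two strata at work, here is a small base R sketch on simulated data for a single year. The standard deviations (0.30 for whole plots, 0.15 for subplots) are assumed values chosen so that the whole-plot error is the larger one, and no true ST or genotype effect is simulated.

```r
set.seed(7)
d <- expand.grid(Block = factor(1:4),
                 ST = factor(c("Autumn", "Spring")),
                 Genotype = factor(paste0("G", 1:7)))

# Assumed SDs: whole-plot error (0.30) larger than subplot error (0.15)
wp_effect <- rnorm(8, 0, 0.30)  # one random deviation per Block x ST whole plot
d$yield <- 2 +
  wp_effect[as.integer(interaction(d$Block, d$ST))] +
  rnorm(nrow(d), 0, 0.15)

# Naive analysis: one pooled residual is the denominator for every F-test
naive <- anova(lm(yield ~ Block + ST * Genotype, data = d))

# Split-plot analysis: ST is tested against the Block:ST (whole-plot) stratum
strata <- summary(aov(yield ~ ST * Genotype + Error(Block/ST), data = d))

naive["ST", c("Df", "F value")]   # denominator: pooled residual with 39 df
strata[["Error: Block:ST"]]       # denominator: whole-plot error with 3 df
```

The naive test for ST borrows 39 residual df that mostly reflect subplot-to-subplot noise, whereas the split-plot test has only 3 df because only eight whole plots ever received a sowing time.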

2.3 How to Run It as a Classical ANOVA (Not What They Did, Despite Their Claim)

aov(yield ~ ST * Genotype * Year + Error(Block/ST), data = dat)

2.4 What They Likely Ran

library(lmerTest)  # loads lme4 and adds Satterthwaite df to anova() for lmer fits
model <- lmer(yield ~ Genotype * ST * Year + (1 | Block) + (1 | Block:ST),
              data = dat)
anova(model)  # Satterthwaite F-tests

lmer + anova() ≠ Classical ANOVA

This is a linear mixed model. With lmerTest loaded, the anova() call uses Satterthwaite-approximated denominator df, which may be non-integer. Reporting this as “ANOVA” obscures what was done.
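A hedged sketch of where the non-integer df come from. The data are simulated with assumed variance components, and three subplots are dropped (an invented missing-data pattern) to unbalance the design; in a fully balanced split-plot the Satterthwaite df generally coincide with the classical ones. The block only fits the model if the lmerTest package is installed.

```r
set.seed(11)
d <- expand.grid(Block = factor(1:4),
                 ST = factor(c("Autumn", "Spring")),
                 Genotype = factor(paste0("G", 1:7)))

# Assumed variance components (illustrative values only)
block_eff <- rnorm(4, 0, 0.20)  # block-to-block variation
wp_eff <- rnorm(8, 0, 0.30)     # whole-plot (Block x ST) variation
d$yield <- 2 + 0.3 * (d$ST == "Autumn") +
  block_eff[as.integer(d$Block)] +
  wp_eff[as.integer(interaction(d$Block, d$ST))] +
  rnorm(nrow(d), 0, 0.15)

# Invented missing-data pattern: drop three subplots to unbalance the design
d <- d[-c(3, 17, 40), ]

if (requireNamespace("lmerTest", quietly = TRUE)) {
  m <- lmerTest::lmer(yield ~ Genotype * ST + (1 | Block) + (1 | Block:ST),
                      data = d)
  print(class(m))                         # a mixed-model class, not "aov"
  print(anova(m, ddf = "Satterthwaite"))  # see the NumDF and DenDF columns
  print(lme4::VarCorr(m))                 # estimated variance components
}
```

The VarCorr() output is the part a classical ANOVA table never reports: the model estimates block and whole-plot variances directly instead of partitioning sums of squares.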

2.5 What a Significant G × ST Interaction Means

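The differences among genotypes depend on sowing time; when the interaction is disordinal, even the ranking of genotypes changes. A single variety recommendation may then not hold across sowing times.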

2.6 Ordinal vs. Disordinal

Type	Definition	Implication
Ordinal	Rankings preserved, magnitudes differ	Main effects broadly valid
Disordinal	Rankings cross over	Separate recommendations essential
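One simple way to check which type you have is to rank the genotypes within each sowing time and compare. The cell means and the genotype names GenoA to GenoC below are invented for illustration and are not the paper's results.

```r
# Hypothetical cell means (Mg/ha), invented for illustration only
means <- matrix(c(2.6, 2.1, 1.9,    # Autumn
                  1.8, 2.0, 1.7),   # Spring
                nrow = 2, byrow = TRUE,
                dimnames = list(ST = c("Autumn", "Spring"),
                                Genotype = c("GenoA", "GenoB", "GenoC")))

# Rank genotypes within each sowing time (1 = highest mean)
ranks <- t(apply(-means, 1, rank))

# Disordinal if the rank order differs between sowing times
is_disordinal <- !all(ranks["Autumn", ] == ranks["Spring", ])

ranks
is_disordinal
```

Comparing ranks of observed means is only descriptive. Whether a crossover is real still depends on the uncertainty around each mean, so pair this with CIs or an interaction plot.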

2.7 Tukey vs. LSD

	LSD	Tukey HSD
Controls	Per-comparison alpha	Familywise error rate
Result	More liberal	More conservative
Used in paper		✅

Warning

Compact letter displays (“a”, “ab”, “b”) can mask near-significant differences. Always look at the actual means and CIs.
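The contrast between the two procedures can be sketched in base R on a simulated one-way layout. The data and genotype labels are hypothetical, and unadjusted pairwise t-tests with a pooled SD stand in here for LSD-style comparisons.

```r
set.seed(3)
# Hypothetical one-way layout: 4 genotypes x 6 plots (simulated)
d <- data.frame(Genotype = factor(rep(paste0("G", 1:4), each = 6)))
d$yield <- c(2.0, 2.2, 2.3, 2.5)[as.integer(d$Genotype)] +
  rnorm(nrow(d), 0, 0.2)

fit <- aov(yield ~ Genotype, data = d)

# LSD-style: unadjusted pairwise t-tests with pooled SD (per-comparison alpha)
lsd <- pairwise.t.test(d$yield, d$Genotype,
                       p.adjust.method = "none", pool.sd = TRUE)

# Tukey HSD: familywise error control, with a CI for each pairwise difference
tuk <- TukeyHSD(fit)

lsd$p.value     # lower-triangular matrix of unadjusted p-values
tuk$Genotype    # columns: diff, lwr, upr, p adj
```

The lwr and upr columns from TukeyHSD() are the intervals to look at instead of relying on letters alone: a difference whose interval barely includes zero reads very differently from one centered on zero.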

2.8 The Method–Question–Data Triangle

       QUESTION
   "Best genotype × 
   sowing time?"
        /\
       /  \
 DATA /    \ METHOD
-----/------\-------
Factorial    Split-plot LMM
blocked      Two error strata
cont+count   Poisson for counts

Alignment? Mostly yes — but the gap between what was done (LMM) and what was reported (“ANOVA”) is a transparency issue.

2.9 The 6 Key Takeaways

Split-plot because ST cannot be randomized at subplot level → two error strata
lmer + anova() is a linear mixed model, not classical ANOVA; calling it “ANOVA” is a reporting issue
Year as fixed is pragmatic with only 2 years but limits generalizability
G×ST is disordinal → separate recommendations required
Poisson for counts was correct if they checked for zero-inflation (not reported)
Data exploration is underreported, as it usually is

2.10 Data analysis

You can generate data with the code below, then analyze it using either the aov() or the lmer() approach and compare the results.

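# Simulate the data: 4 replicates (Block), 2 sowing times (ST), 7 genotypes (Genotype), 2 years (Year)
set.seed(42)
geno_levels <- c(paste0("V", 1:6), "CELINE")  # six lines plus the CELINE check, in design order
dat <- expand.grid(Block = factor(1:4),
                   ST = factor(c("Autumn", "Spring")),
                   Genotype = factor(geno_levels, levels = geno_levels),
                   Year = factor(2018:2019))
dat$yield <- 1.8 +
   0.30 * (dat$ST == "Autumn") +                                                # sowing-time main effect
   c(0.10, -0.10, 0.40, -0.20, 0.05, 0.10, 0.20)[as.integer(dat$Genotype)] +   # genotype main effects (V1..V6, CELINE)
   0.40 * (dat$ST == "Autumn" & dat$Genotype == "V3") +                         # G x ST: V3 gains extra yield in autumn
   -0.05 * (dat$ST == "Spring" & dat$Genotype == "V5") +                        # G x ST: V5 loses slightly in spring
   0.20 * (dat$Year == "2019") +                                                # year effect
   rnorm(nrow(dat), 0, 0.15)  # residual noise only; no block or whole-plot effects are simulated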