Teaching Logical Fallacies with AI Gems and Prompted Agents
In the age of AI, one of the best ways to teach logical fallacies is not to have students passively ask a chatbot for answers, but to have them
collaboratively design a Gemini Gem or other pre-prompted agent that must identify, score, explain, rebut, and repair bad reasoning. When students
build the analytical frame together, they make their own standards explicit and become much better at seeing both the power and the limits of AI diagnosis.
What this article is for
The goal is to help teachers turn AI from a shortcut into a visible thinking tool. The class uses the model as a provisional analyst whose
output must be judged, revised, and improved by students rather than simply accepted.
What this method avoids
It avoids the shallow classroom pattern where students paste in a passage, collect labels, and call that critical thinking. The human work here
lies in building the prompt, auditing the output, tightening the distinctions, and deciding when the model is overreaching.
Students externalize their standards
The moment students try to write the Gem's instructions, they are forced to decide what counts as good evidence for a fallacy label, what counts
as a false positive, and how much quotation is needed to make the diagnosis fair.
Comparison becomes unavoidable
A strong agent prompt has to say how to choose among near neighbors. That means students must clarify the differences among Straw man,
Red herring, Ad hominem, False equivalence, and other often-confused entries rather than relying on vague familiarity.
The model's mistakes become teachable moments
When the agent overlabels, misses a caveat, or confuses two fallacies, the class has concrete material to debug. The failure is no longer hidden
inside a teacher lecture; it becomes a visible reasoning event the whole room can inspect.
Students practice rebuttal and repair
A good agent does not stop with naming. It also explains the misstep, answers it in plain language, and suggests a stronger rewrite. That keeps
the exercise pointed toward better reasoning rather than mere accusation.
Best general sequence: choose a text → build the prompt → run the agent → audit the output → revise the prompt → score and respond.
1. Choose a manageable source
Use a short editorial, op-ed paragraph, debate exchange, or opening statement. The text should be rich enough to contain real argumentative moves,
but short enough that the class can still inspect each quoted line carefully.
2. Build the first prompt as a group
Draft the Gem's role, output format, and scoring rules on the board. Ask students what the agent must be prevented from doing, especially in cases
where weak or ambiguous reasoning might tempt it into overdiagnosis.
3. Run the agent on the same text
Have the class watch one common output rather than scattering immediately into private runs. That shared output gives everyone the same object to critique.
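If a class runs this shared pass through the API rather than the Gemini app, a minimal Python sketch looks roughly like the following. It assumes the google-generativeai SDK; the model name is a placeholder, and GEM_INSTRUCTIONS and passage_text stand in for the class-drafted prompt and the chosen text.

import google.generativeai as genai

GEM_INSTRUCTIONS = "..."  # the full prompt the class drafted on the board
passage_text = "..."      # the editorial or debate excerpt chosen in step 1

genai.configure(api_key="YOUR_API_KEY")

# The class prompt becomes the system instruction, so every run starts
# from the same analytical frame.
model = genai.GenerativeModel(
    model_name="gemini-1.5-pro",  # placeholder; any capable Gemini model
    system_instruction=GEM_INSTRUCTIONS,
)

response = model.generate_content(passage_text)
print(response.text)  # project this single output for the whole class to audit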
4. Audit every finding
For each fallacy the agent flags, ask: Is the quote sufficient? Is the label the most precise one? What nearby labels should be ruled out? And what caveat should be stated before we accept the diagnosis?
5. Revise the prompt
Tighten the instructions in response to the model's errors. Over time, the class learns that prompt-writing is really criteria-writing in disguise.
6. End with human judgment
The final class product should not be "what the model said." It should be a human-vetted set of fallacy diagnoses, scores, rebuttals, and repairs.
Identify only the strongest candidates
The agent should not carpet-bomb a passage with labels. Ask it to identify only the clearest two or three fallacies unless the user specifically requests a full sweep.
Quote enough to make the diagnosis visible
Require a short, sufficient quotation for each finding. This keeps the agent from floating free of the text and makes the reasoning misstep inspectable.
Explain the dynamics, not just the label
A good answer says exactly how the passage moves from evidence to conclusion, where the drift happens, and why that move is too fast, too broad, too selective, or too distracting.
Score and respond
The agent should assign clear scores and then do something constructive: rebut the misstep in plain language, or repair the claim so it says only what the argument has earned.
1. Role and stance
Tell the model to act like a careful classroom analyst rather than a prosecuting attorney. This single component sharply reduces overlabeling and rhetorical heat.
You are a careful logical-fallacy analyst for a critical thinking class.
Your job is not to hunt for labels aggressively, but to identify only the most justified fallacies in the passage.
Prefer specificity over broad accusation.
If a suspected fallacy is weak, borderline, or plausibly explained another way, say so clearly.
Do not moralize. Diagnose the reasoning, quote the relevant wording, explain the dynamics, and propose a fairer response or repair.
2. Output schema
Strong outputs come from strong formatting constraints. If you want organized answers, specify the headings, quote rules, score labels, and link fields directly.
Format every answer with this structure:
◉ Source Summary
➘ One or two sentences summarizing the article, speech, or debate passage.
◉ Fallacy Findings
For each fallacy found, use this structure:
➘ Fallacy Name: [most specific label]
➘ Confidence Score (0-4): [number]
➘ Distortion Score (0-4): [number]
➘ Salient Quote: "[quote only the key lines needed to see the misstep]"
➘ Why It Fits: [3-5 sentences explaining the reasoning failure]
➘ Caveat: [state what would make this label too strong or misapplied]
➘ LogFall Link: https://logfall.com/fallacies/[slug]/
➘ Response: [answer the misstep in plain language]
➘ Repair: [rewrite the claim in a stronger, fairer form]
◉ Overall Pattern
➘ Briefly describe what kind of reasoning drift the piece shows overall.
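Because the schema is line-based, a class that wants to tally findings across several runs can scrape them with a short, admittedly rough Python helper. This is a hypothetical sketch, not part of the method, and it assumes the Gem actually followed the format above.

import re

# Field names from the schema above; other ➘ lines (such as the
# source summary) are ignored.
FIELDS = {"Fallacy Name", "Confidence Score", "Distortion Score",
          "Salient Quote", "Why It Fits", "Caveat", "LogFall Link",
          "Response", "Repair"}

def parse_findings(output_text):
    """Group the '➘ Field: value' lines of one Gem answer into a dict per finding."""
    findings, current = [], {}
    for line in output_text.splitlines():
        match = re.match(r"➘\s*(.+?)\s*:\s*(.*)", line.strip())
        if not match:
            continue
        # Tolerate either "Confidence Score" or "Confidence Score (0-4)".
        field = match.group(1).replace("(0-4)", "").strip()
        if field not in FIELDS:
            continue
        if field == "Fallacy Name" and current:
            findings.append(current)  # a new name starts the next finding
            current = {}
        current[field] = match.group(2).strip()
    if current:
        findings.append(current)
    return findings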
3. Scoring rules
Separate confidence from importance. A model may be very confident that a minor fallacy is present, or only moderately confident that a major one shapes the passage.
Scoring rubric:
0 = not present or too weak to justify
1 = faint or merely possible
2 = present but modest
3 = strong and clear
4 = central and unmistakable
Use two separate scores:
➘ Confidence Score: how sure you are that the label fits
➘ Distortion Score: how much that fallacy affects the passage's overall reasoning
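For classes that log findings over several runs, the two-score rule is easy to make concrete as a small record type. A hedged Python sketch, not part of the Gem itself; the field names mirror the output schema, and the check simply enforces the rubric's 0-4 range.

from dataclasses import dataclass

@dataclass
class FallacyFinding:
    """One finding from the Gem, mirroring the output schema."""
    fallacy_name: str
    confidence_score: int  # how sure the model is that the label fits (0-4)
    distortion_score: int  # how much the fallacy shapes the passage (0-4)
    salient_quote: str

    def __post_init__(self):
        # Enforce the rubric's 0-4 range on both scores.
        for name in ("confidence_score", "distortion_score"):
            value = getattr(self, name)
            if not 0 <= value <= 4:
                raise ValueError(f"{name} must be between 0 and 4, got {value}")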
4. Passage-handling rules
The prompt should say how many fallacies to report, how much to quote, and when to withhold a label. Those guardrails often matter more than the fancy wording at the top.
Passage handling rules:
1. Quote only the lines needed to understand the fallacy.
2. Identify no more than the 3 strongest fallacies unless the user requests a fuller sweep.
3. Compare close alternatives before settling on the final label.
4. Prefer one precise label over several overlapping ones.
5. When no clear fallacy appears, say that directly instead of forcing a diagnosis.
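Rules 2 and 5 are mechanical enough to spot-check in code before the human audit begins. A hypothetical helper, assuming findings have already been reduced to (label, confidence) pairs, for example by a parser like the sketch earlier.

def audit_findings(findings, full_sweep=False):
    """Return human-readable warnings for outputs that break the handling rules.

    findings: list of (label, confidence_score) pairs from one Gem answer.
    """
    warnings = []
    if not full_sweep and len(findings) > 3:
        warnings.append(f"Reported {len(findings)} fallacies; rule 2 caps a normal run at 3.")
    for label, confidence in findings:
        if confidence == 0:
            warnings.append(f"'{label}' scored confidence 0; rule 5 says withhold the label.")
    return warnings

For example, audit_findings([("Straw man", 3), ("Red herring", 0)]) returns a single warning about the zero-confidence label.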
Confidence score
This score answers: how well does the passage justify the label? Students can argue over whether the quotation really supports the diagnosis or whether the evidence is too thin.
Distortion score
This score answers: how much does that fallacy matter to the overall argument? Some fallacies are present but peripheral; others shape the whole reasoning structure.
Optional class extension
If you want a richer tool, add one more score for repairability: how easy is it to salvage the argument without giving up its core point?
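One possible wording for that extra field, written to match the rubric above; the name and scale are suggestions rather than part of the core method.

➘ Repairability Score (0-4): how easily the claim can be salvaged without giving up its core point, where 0 means unsalvageable and 4 means easily repaired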
Keep the human override explicit
The class should always be free to lower a score, reject a label, or replace the model's chosen fallacy with a better one. The point is calibration, not obedience.
Prompt quality
Did the group write clear instructions, good output constraints, and fair caveat rules? A vague prompt usually reveals a vague understanding.
Audit quality
Did students catch false positives, weak quotations, and sloppy category choices? The audit is often more pedagogically valuable than the first run itself.
Rebuttal and repair quality
Did the group answer the reasoning mistake clearly and then offer a stronger formulation? This is where the exercise becomes constructive rather than merely classificatory.
Reflective accuracy
Can students say where the agent helped, where it overreached, and what that reveals about both AI and fallacy instruction? That meta-level reflection is part of the lesson.
Do not let the model become the authority
A Gem is a scaffold, not an oracle. Students should always be asked to justify the final judgment independently of the model's wording.
Do not reward overdiagnosis
If students think more labels mean a better answer, the exercise will quickly turn into fallacy inflation. Reward precision, restraint, and clean comparison instead.
Do not separate fallacies from the wider toolkit
Some problems are really about bad statistics, weak causal design, or cognitive bias rather than classic fallacy forms. The agent should be taught to say that when needed.
Do not hide the prompt from students
The prompt is the curriculum in compressed form. When students can inspect and revise it, they are learning the criteria themselves rather than merely consuming an answer.
Takeaway
Treat the Gem as a collaboratively built reasoning instrument, not as a replacement for judgment.
The strongest classroom use of AI is not passive extraction but collaborative calibration. Students learn logical fallacies more deeply when they must tell
the agent what to look for, how to quote, how to compare close labels, how to score responsibly, and how to answer a fallacy in plain language once it is found.