Teaching Logical Fallacies with AI Gems and Prompted Agents
In the age of AI, one of the best ways to teach logical fallacies is not to have students passively ask a chatbot for answers, but to have them
collaboratively design a Gemini Gem or other pre-prompted agent that must identify, score, explain, rebut, and repair bad reasoning. When students
build the analytical frame together, they make their own standards explicit and become much better at seeing both the power and the limits of AI diagnosis.
What this article is for
The goal is to help teachers turn AI from a shortcut into a visible thinking tool. The class uses the model as a provisional analyst whose
output must be judged, revised, and improved by students rather than simply accepted.
What this method avoids
It avoids the shallow classroom pattern where students paste in a passage, collect labels, and call that critical thinking. The human work here
lies in building the prompt, auditing the output, tightening the distinctions, and deciding when the model is overreaching.
Students externalize their standards
The moment students try to write the Gem's instructions, they are forced to decide what counts as good evidence for a fallacy label, what counts
as a false positive, and how much quotation is needed to make the diagnosis fair.
Comparison becomes unavoidable
A strong agent prompt has to say how to choose among near neighbors. That means students must clarify the differences among Straw man,
Red herring, Ad hominem, False equivalence, and other often-confused entries rather than relying on vague familiarity.
The model's mistakes become teachable moments
When the agent overlabels, misses a caveat, or confuses two fallacies, the class has concrete material to debug. The failure is no longer hidden
inside a teacher lecture; it becomes a visible reasoning event the whole room can inspect.
Students practice rebuttal and repair
A good agent does not stop with naming. It also explains the misstep, answers it in plain language, and suggests a stronger rewrite. That keeps
the exercise pointed toward better reasoning rather than mere accusation.
Best general sequence: choose a text → build the prompt → run the agent → audit the output → revise the prompt → score and respond.
1. Choose a manageable source
Use a short editorial, op-ed paragraph, debate exchange, or opening statement. The text should be rich enough to contain real argumentative moves,
but short enough that the class can still inspect each quoted line carefully.
2. Build the first prompt as a group
Draft the Gem's role, output format, and scoring rules on the board. Ask students what the agent must be prevented from doing, especially in cases
where weak or ambiguous reasoning might tempt it into overdiagnosis.
3. Run the agent on the same text
Have the class watch one common output rather than scattering immediately into private runs. That shared output gives everyone the same object to critique.
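If a class runs this shared pass through the API rather than the Gemini app, a minimal Python sketch looks roughly like the following. It assumes the google-generativeai SDK; the model name is a placeholder, and GEM_INSTRUCTIONS and passage_text stand in for the class-drafted prompt and the chosen text.

import google.generativeai as genai

GEM_INSTRUCTIONS = "..."  # the full prompt the class drafted on the board
passage_text = "..."      # the editorial or debate excerpt chosen in step 1

genai.configure(api_key="YOUR_API_KEY")

# The class prompt becomes the system instruction, so every run starts
# from the same analytical frame.
model = genai.GenerativeModel(
    model_name="gemini-1.5-pro",  # placeholder; any capable Gemini model
    system_instruction=GEM_INSTRUCTIONS,
)

response = model.generate_content(passage_text)
print(response.text)  # project this single output for the whole class to audit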
4. Audit every finding
For each fallacy the agent flags, ask: Is the quote sufficient? Is the label the most precise one? What nearby labels should be ruled out? And what caveat should be stated before we accept the diagnosis?
5. Revise the prompt
Tighten the instructions in response to the model's errors. Over time, the class learns that prompt-writing is really criteria-writing in disguise.
6. End with human judgment
The final class product should not be "what the model said." It should be a human-vetted set of fallacy diagnoses, scores, rebuttals, and repairs.
Identify only the strongest candidates
The agent should not carpet-bomb a passage with labels. Ask it to identify only the clearest two or three fallacies unless the user specifically requests a full sweep.
Quote enough to make the diagnosis visible
Require a short, sufficient quotation for each finding. This keeps the agent from floating free of the text and makes the reasoning misstep inspectable.
Explain the dynamics, not just the label
A good answer says exactly how the passage moves from evidence to conclusion, where the drift happens, and why that move is too fast, too broad, too selective, or too distracting.
Score and respond
The agent should assign clear scores and then do something constructive: rebut the misstep in plain language, or repair the claim so it says only what the argument has earned.
1. Role and stance
Tell the model to act like a careful classroom analyst rather than a prosecuting attorney. This single component sharply reduces overlabeling and rhetorical heat.
You are a careful logical-fallacy analyst for a critical thinking class.
Your job is not to hunt for labels aggressively, but to identify only the most justified fallacies in the passage.
Prefer specificity over broad accusation.
If a suspected fallacy is weak, borderline, or plausibly explained another way, say so clearly.
Do not moralize. Diagnose the reasoning, quote the relevant wording, explain the dynamics, and propose a fairer response or repair.
2. Output schema
Strong outputs come from strong formatting constraints. If you want organized answers, specify the headings, quote rules, score labels, and link fields directly.
Format every answer with this structure:
◉ Source Summary
➘ One or two sentences summarizing the article, speech, or debate passage.
◉ Fallacy Findings
For each fallacy found, use this structure:
➘ Fallacy Name: [most specific label]
➘ Confidence Score (0-4): [number]
➘ Distortion Score (0-4): [number]
➘ Salient Quote: "[quote only the key lines needed to see the misstep]"
➘ Why It Fits: [3-5 sentences explaining the reasoning failure]
➘ Caveat: [state what would make this label too strong or misapplied]
➘ LogFall Link: https://logfall.com/fallacies/[slug]/
➘ Response: [answer the misstep in plain language]
➘ Repair: [rewrite the claim in a stronger, fairer form]
◉ Overall Pattern
➘ Briefly describe what kind of reasoning drift the piece shows overall.
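Because the schema is line-based, a class that wants to tally findings across several runs can scrape them with a short, admittedly rough Python helper. This is a hypothetical sketch, not part of the method, and it assumes the Gem actually followed the format above.

import re

# Field names from the schema above; other ➘ lines (such as the
# source summary) are ignored.
FIELDS = {"Fallacy Name", "Confidence Score", "Distortion Score",
          "Salient Quote", "Why It Fits", "Caveat", "LogFall Link",
          "Response", "Repair"}

def parse_findings(output_text):
    """Group the '➘ Field: value' lines of one Gem answer into a dict per finding."""
    findings, current = [], {}
    for line in output_text.splitlines():
        match = re.match(r"➘\s*(.+?)\s*:\s*(.*)", line.strip())
        if not match:
            continue
        # Tolerate either "Confidence Score" or "Confidence Score (0-4)".
        field = match.group(1).replace("(0-4)", "").strip()
        if field not in FIELDS:
            continue
        if field == "Fallacy Name" and current:
            findings.append(current)  # a new name starts the next finding
            current = {}
        current[field] = match.group(2).strip()
    if current:
        findings.append(current)
    return findings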
3. Scoring rules
Separate confidence from importance. A model may be very confident that a minor fallacy is present, or only moderately confident that a major one shapes the passage.
Scoring rubric:
0 = not present or too weak to justify
1 = faint or merely possible
2 = present but modest
3 = strong and clear
4 = central and unmistakable
Use two separate scores:
➘ Confidence Score: how sure you are that the label fits
➘ Distortion Score: how much that fallacy affects the passage's overall reasoning
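For classes that log findings over several runs, the two-score rule is easy to make concrete as a small record type. A hedged Python sketch, not part of the Gem itself; the field names mirror the output schema, and the check simply enforces the rubric's 0-4 range.

from dataclasses import dataclass

@dataclass
class FallacyFinding:
    """One finding from the Gem, mirroring the output schema."""
    fallacy_name: str
    confidence_score: int  # how sure the model is that the label fits (0-4)
    distortion_score: int  # how much the fallacy shapes the passage (0-4)
    salient_quote: str

    def __post_init__(self):
        # Enforce the rubric's 0-4 range on both scores.
        for name in ("confidence_score", "distortion_score"):
            value = getattr(self, name)
            if not 0 <= value <= 4:
                raise ValueError(f"{name} must be between 0 and 4, got {value}")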
4. Passage-handling rules
The prompt should say how many fallacies to report, how much to quote, and when to withhold a label. Those guardrails often matter more than the fancy wording at the top.
Passage handling rules:
1. Quote only the lines needed to understand the fallacy.
2. Identify no more than the 3 strongest fallacies unless the user requests a fuller sweep.
3. Compare close alternatives before settling on the final label.
4. Prefer one precise label over several overlapping ones.
5. When no clear fallacy appears, say that directly instead of forcing a diagnosis.
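Rules 2 and 5 are mechanical enough to spot-check in code before the human audit begins. A hypothetical helper, assuming findings have already been reduced to (label, confidence) pairs, for example by a parser like the sketch earlier.

def audit_findings(findings, full_sweep=False):
    """Return human-readable warnings for outputs that break the handling rules.

    findings: list of (label, confidence_score) pairs from one Gem answer.
    """
    warnings = []
    if not full_sweep and len(findings) > 3:
        warnings.append(f"Reported {len(findings)} fallacies; rule 2 caps a normal run at 3.")
    for label, confidence in findings:
        if confidence == 0:
            warnings.append(f"'{label}' scored confidence 0; rule 5 says withhold the label.")
    return warnings

For example, audit_findings([("Straw man", 3), ("Red herring", 0)]) returns a single warning about the zero-confidence label.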
Confidence score
This score answers: how well does the passage justify the label? Students can argue over whether the quotation really supports the diagnosis or whether the evidence is too thin.
Distortion score
This score answers: how much does that fallacy matter to the overall argument? Some fallacies are present but peripheral; others shape the whole reasoning structure.
Optional class extension
If you want a richer tool, add one more score for repairability: how easy is it to salvage the argument without giving up its core point?
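One possible wording for that extra field, written to match the rubric above; the name and scale are suggestions rather than part of the core method.

➘ Repairability Score (0-4): how easily the claim can be salvaged without giving up its core point, where 0 means unsalvageable and 4 means easily repaired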
Keep the human override explicit
The class should always be free to lower a score, reject a label, or replace the model's chosen fallacy with a better one. The point is calibration, not obedience.
Prompt quality
Did the group write clear instructions, good output constraints, and fair caveat rules? A vague prompt usually reveals a vague understanding.
Audit quality
Did students catch false positives, weak quotations, and sloppy category choices? The audit is often more pedagogically valuable than the first run itself.
Rebuttal and repair quality
Did the group answer the reasoning mistake clearly and then offer a stronger formulation? This is where the exercise becomes constructive rather than merely classificatory.
Reflective accuracy
Can students say where the agent helped, where it overreached, and what that reveals about both AI and fallacy instruction? That meta-level reflection is part of the lesson.
Do not let the model become the authority
A Gem is a scaffold, not an oracle. Students should always be asked to justify the final judgment independently of the model's wording.
Do not reward overdiagnosis
If students think more labels mean a better answer, the exercise will quickly turn into fallacy inflation. Reward precision, restraint, and clean comparison instead.
Do not separate fallacies from the wider toolkit
Some problems are really about bad statistics, weak causal design, or cognitive bias rather than classic fallacy forms. The agent should be taught to say that when needed.
Do not hide the prompt from students
The prompt is the curriculum in compressed form. When students can inspect and revise it, they are learning the criteria themselves rather than merely consuming an answer.
Takeaway
Treat the Gem as a collaboratively built reasoning instrument, not as a replacement for judgment.
The strongest classroom use of AI is not passive extraction but collaborative calibration. Students learn logical fallacies more deeply when they must tell
the agent what to look for, how to quote, how to compare close labels, how to score responsibly, and how to answer a fallacy in plain language once it is found.