TLDR
There’s a widespread belief online that it is objectively better to create one’s own cards rather than use pre-made
The rationale is that self-made cards will ultimately come with better retention
But not only is research on this mixed, it can actually be poor advice if one does not know how to create good, retainable cards
The most widely cited study supporting “self-made better” has some important limitations that make it difficult to draw any meaningful conclusions
(Strangely, I can’t recall or find anyone who has actually looked into this paper despite its frequent citation)
There are pros and cons to both self-made and pre-made, but ultimately the most important factors are that the cards are sufficient, high-quality and relevant - regardless of how they came to be
High-quality pre-made cards can actually be uniquely beneficial to ensure one understands the material rather than just the (self-made) card
(In an interview, an exam or a conversation with a native speaker, the material likely won’t be worded the same way one has worded it in a perfectly-worded self-made card)
Just because a card is self-made does not mean it’s inherently good or better, and vice versa for pre-made
Shaeda can (and should) be used for both.
Intro
It’s often asserted in online discussions that learning via one’s own flashcards is better than using pre-made ones. Whilst it’s not necessarily wrong per se, I suspect that the majority of individuals making this blanket recommendation have simply read it from someone else online and carried on the domino-chain rather than thinking any further about it - which can happen.1
The rationale is actually perfectly fine: self-made cards can come with greater retention (for a mix of reasons). This is good. In fact, it’s very good. After all, strip ‘studying’ or ‘learning’ down and all it ultimately is is gradually increasing the amount of stuff we can (easily) remember. Retention is absolutely front-and-center2.
However, whilst I don’t entirely disagree with the advice or the rationale, I think it does require some context and nuance, which I’ll try to speak to further down. But before that, I want to take a look at the widely cited paper supporting self-made over pre-made.
Research
Open access: User-Generated Digital Flashcards Yield Better Learning Than Pre-Made Flashcards

Abstract & Methods
I’ll try to summarise the paper and their methods as briefly as possible, then provide some screenshots of relevant sections underneath:
Undergraduate students read 1 page’s worth of material on a random topic for 5 minutes and were told the 10 key terms they had to learn
After reading for 5 minutes, each student then studied this material using either pre-made or self-made cards for 20 minutes
After a 5-minute break, each student then read another page of a different random topic and again studied for 20 minutes, but this time using the opposite card method (e.g., pre-made cards topic 1 —> self-made cards topic 2, and vice versa)
After this second 20-minute block, all studying was finished
Two days later, the students sat a multiple-choice online test covering both topics
Averaging across both tests, the self-made group performed better on the quiz (~68% vs. ~60%, on average)

Limitations?
So firstly, as is probably apparent to everyone reading, the setup was not really reflective of any ‘real-world’ use:
Reading 1 page of a random topic for 5 minutes
A single 20-minute session (consisting of optional flipping without grading)
Then reading a different random topic and studying this for 20 minutes
2 days later, sitting a multiple choice quiz
You can compare this to, say, learning a new language, or studying for a degree, where one has to learn several textbooks’ worth of information over 3-10 years and it’s probably not quite the same.
Card Structure
Fixing the order of cards, preventing ‘hiding’ cards, preventing extra study of certain cards and making it optional to check the answer together defeat a large part of (in fact, almost the entire) point of using flashcards.
Ultra-Short Timespan and Content Length
A 5-minute read of 1 page on a random topic, 20 minutes of interaction with the cards for one session, then a test 2 days later.
There seems to be quite a large misunderstanding or underappreciation of SRS and how it works. It needs time. All learning needs time. If it didn’t, school wouldn’t take us 20 years. Of course it is not feasible to expect researchers to commit to such time-spans given costs, priorities etc., but it’s important to be aware of just how short this was.
No Grading
Below is a video of the software used for the flashcards. Not having any form of grading seems very strange. Part of the reason for using flashcards is that you can easily identify and naturally focus on the content you are struggling to learn, and deprioritise that which you aren’t. Mindlessly flipping-and-repeating for 20 minutes is probably not going to be doing much - if anything. But then again, it was just a single session on a random topic for 20 minutes. Even with grading, there would likely have been only a marginal improvement.
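To make the grading point concrete, here is a minimal sketch of how even a crude right/wrong grade lets a session drift towards the cards one is struggling with. Everything here (`Card`, `grade`, `next_card`) is a hypothetical illustration; it is not the software used in the paper, nor how any particular SRS schedules reviews.

```python
from dataclasses import dataclass


@dataclass
class Card:
    front: str
    back: str
    hits: int = 0    # times the learner graded themselves "right"
    misses: int = 0  # times the learner graded themselves "wrong"


def grade(card: Card, correct: bool) -> None:
    """Record a self-assessed grade after the card is flipped."""
    if correct:
        card.hits += 1
    else:
        card.misses += 1


def next_card(deck: list[Card]) -> Card:
    """Pick the card with the worst record (most misses relative to hits).

    Ties fall back to deck order. This is a deliberately naive rule,
    assumed for illustration only.
    """
    return max(deck, key=lambda c: c.misses - c.hits)
```

Without the `grade` step there is nothing for `next_card` to work with, which is roughly the situation of a fixed-order deck with optional flipping.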
Low-Quality Pre-Made Cards
This gets to what I mentioned at the start. If the card is low-quality, it doesn’t matter if it’s self-made or pre-made. From the video above (and screenshots below), we can see what the pre-made cards looked like. These are objectively poor quality:
No way to see the question and answer at the same time
Far too verbose on the back
No real context or question is given on the front, making it harder to apply later on
Strangely, these appear to have been deemed high-quality by those involved.
Mixed Findings From Other Studies
The authors provide a table summarising other studies on this, and we can see that it’s actually consistently mixed. Even if we only look at the studies that were of a longer (but still very short) timespan (highlighted in red), we can see it’s still mixed.
I’m going to assume that the limitations present in this study were probably also present in these studies, too. That would make the scattering of results logical: it’s probably just noisy research due to study design.
No Control (Minor)
I think it would have been very valuable to have had a control topic whereby the students read a page on another random topic but did not use flashcards and instead used only the (sadly) most common forms of studying such as rereading and note-taking etc. I suspect that the performance on this topic’s quiz would have been very poor - in which case the framing of the study may have shifted somewhat.
Logical Flaws (?)
I could simply be misunderstanding here, but it seems there may have been two potential logical flaws? Below, the authors write that they used a 48-hour delayed test specifically to avoid measuring suboptimal learning techniques such as cramming (whereby the entire deck of cards is studied repeatedly in a single session, typically before an exam).
But as we’ve seen, the setup used was virtually textbook cramming: looping through the same cards continuously for 20 minutes. It seems like they are relying entirely on the two-day delayed test to claim they are measuring durable, long-term learning, even though the study session itself was a one-off 20-minute cramming block (?). Simply delaying the test does not turn the study session into distributed practice.
Further, it is widely known (and stated) that students benefit more when there is (immediate) correct-answer feedback, as can occur when cards are flipped over to check the answer - yet as we’ve seen already, this was made optional in the present study?
Pre-Made versus LLM-Made?
I’m going to show a real example of a paid-for (suggesting it should be of higher quality) pre-made card below. It is very low-quality.
Importantly, the problem is not necessarily the content on the card, but the amount of content and its formatting. This would not be at all productive for whoever is studying it. It contains 57 words on the back. This is not a cherry-picked example.
But now, if we simply prompt an appropriate LLM with some very basic, effective-flashcard principles, we get the following cards (plural). The plural here is important, as one of the principles of flashcards is keeping them as atomic and as blunt as reasonably possible. That means multiple cards should be used for the same concept.
These are all much higher quality.
Which gets to the main point of this section: LLM-made cards are technically pre-made, but they can be customised far more easily than typical pre-made cards, and thus made very high-quality with trivial effort. I suspect that a lot of the reason pre-made has a bad reputation is that very basic principles were not followed when the cards were created, which ironically then probably fed into students also creating poor-quality cards. Interestingly, we actually see this alluded to in wider research, where teachers were not able to answer student-created flashcards (Dodigovic, 2013).
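As a rough illustration of the "atomic and blunt" principle, here is a small sketch that flags cards whose back is too wordy, whoever made them. The 15-word threshold and the function names are my own assumptions for the example, not an established rule.

```python
def back_word_count(back: str) -> int:
    """Count whitespace-separated words on the back of a card."""
    return len(back.split())


def flag_verbose(cards: dict[str, str], max_words: int = 15) -> list[str]:
    """Return the fronts of cards whose back exceeds max_words.

    `cards` maps front text to back text. The default threshold is an
    arbitrary assumption; tune it to the subject being studied.
    """
    return [
        front
        for front, back in cards.items()
        if back_word_count(back) > max_words
    ]
```

A card with 57 words on the back, like the paid-for example, would be flagged and is a candidate for splitting into several smaller cards.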



Language Learning versus Academic Learning
Another key piece of context is what is being learned. If one is studying an academic subject, making your own cards whilst reading a textbook is fairly simple: read, synthesise (whilst trying to avoid overfitting), and type.
However, if you are learning a language, the advice to only “create your own cards” carries a higher risk of becoming detrimental.
The reason is essentially one of friction and/or time. When learning a new language, we often cannot just type the Target Language (TL) from our keyboard. We may be able to for some words if they share the same script, but often we will not - or at least not easily. Typically one has to constantly stop and copy-paste the native script from elsewhere and then maybe try to find a sentence example, and copy-paste that too. But even if one does manage that, of course an aspect that’s unique to language learning is the necessity of audio. Manually downloading audio files or setting up third-party TTS add-ons for self-made cards is incredibly tedious - in fact many simply put up with not actually hearing the language they’re learning for this very reason.
And even if you do manage to configure audio, you don’t actually have any easy control over the voice or playback speed etc. What ends up happening is that many beginners will simply give up when faced with this initial friction. They’ll either abandon flashcards entirely and jump to a gamified pseudo-learning app, or they’ll try to persevere and download a pre-made deck online only to find out that the deck is not level-appropriate, includes irrelevant cards they do not wish to learn yet (for whatever reason), and is missing audio. Or has incredibly loud audio. Or far too quiet audio. Or audio that you’re not able to play back easily as there’s no ‘play’ button included. The list goes on.
Luckily, Shaeda solves all of these worries.

Summing up
With my own studying over the years, both casual and formal, my self-made cards typically come with about 10-20% better relative retention per card. That’s good. The downside, however, is that these self-made decks are around 5x smaller due to the massive time investment of creating them. For certain cards it was essential, which is fine, but for a lot, it was probably not. I would have been better off using Shaeda had it existed.
The trade-off essentially comes down to what I’m very casually and imprecisely terming relative knowledge and absolute knowledge (better and more precise terms exist3):
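A back-of-the-envelope sketch of that trade-off, using made-up numbers: the 1,000-card deck and the 80% baseline retention are assumptions purely for illustration, and the 15% uplift is simply the midpoint of the 10-20% range above.

```python
def cards_retained(deck_size: int, retention: float) -> float:
    """Expected number of cards retained, given a per-card retention rate."""
    return deck_size * retention


# Assumed baseline for a pre-made deck (illustrative numbers only).
premade_size = 1000
premade_retention = 0.80

# Self-made: about 5x fewer cards, about 15% better relative retention.
selfmade_size = premade_size // 5
selfmade_retention = premade_retention * 1.15

premade_total = cards_retained(premade_size, premade_retention)
selfmade_total = cards_retained(selfmade_size, selfmade_retention)

print(f"pre-made:  ~{premade_total:.0f} cards retained")
print(f"self-made: ~{selfmade_total:.0f} cards retained")
```

Under these assumptions the pre-made deck leaves roughly 800 cards retained against roughly 184 for the self-made deck: better per-card retention, but far less retained overall.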
Relative Knowledge (RK): how easily one can recall each piece of content learned, regardless of volume.
Absolute Knowledge (AK): the total volume of content that has been learned.
When we first begin learning, say, a new language or a new subject, both domains are low - potentially even zero. As one begins to learn, RK increases faster (in percentage terms) than AK. After a few days of learning, say, Chinese, I may have studied 10 words and successfully been able to recall 8 when tested by a teacher (or on Shaeda’s diagnostic test). That’s an 80% RK. Unfortunately for me it takes around 5,000 words to become fluent, so my AK of 8 is just 0.16% - not quite these guys just yet. Consider the flip-side of, say, a retired professor: an incredibly high AK, but their RK has likely dropped.
If one is aiming to learn several textbooks’ worth of content over many years, it would probably be somewhat strange to celebrate, say, a 90% recall of book 1, chapter 1 after a year of study over, say, an 80% recall of 3 entire books. Here, RK has been maximised at the expense of AK. There are many instances where this will be fine, correct even, but given the initial goal was to learn several textbooks, it is probably detrimental.
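The arithmetic in the Chinese example can be written out as a tiny sketch. The function names are mine, and expressing AK as a fraction of a target vocabulary is an assumption made to match the 0.16% figure above.

```python
def relative_knowledge(recalled: int, studied: int) -> float:
    """Share of the studied items that can be recalled."""
    return recalled / studied


def absolute_knowledge(recalled: int, target: int) -> float:
    """Recalled items as a share of the overall target (e.g. ~5,000 words)."""
    return recalled / target


rk = relative_knowledge(recalled=8, studied=10)
ak = absolute_knowledge(recalled=8, target=5000)

print(f"RK: {rk:.0%}, AK: {ak:.2%}")  # RK: 80%, AK: 0.16%
```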
The concern and/or risk of pre-made is that the cards may be low quality - and one would not know until they view them for the first time. But this is not unique to pre-made. It can apply just as much to self-made, which we’ve already seen (Dodigovic, 2013). After all, every pre-made deck was self-made to someone, somewhere.
Ultimately, all-else-equal, the act of creating one’s own cards will likely be contributing to the learning effect - albeit very slightly. That part is true. But whether that means that one should use and only use their own self-made cards is another question entirely.
Takeaways:
Quality matters much more than authorship
The frequently-cited paper supporting “self-made always” has many limitations that are seemingly not widely appreciated
Whether it is language learning or academic learning is important context
Leveraging LLMs correctly can be incredibly (and uniquely) beneficial
Use a lot of both, just make sure they’re good4
PS: A lot of users who have signed up for Shaeda have all these posts go to their spam - this isn’t ideal. If you found this post insightful, please consider liking and sharing. Substack say that this will help emails be directed to inboxes instead. Thank you!
2. Interestingly, this is somewhat debated. I don’t know why. For further reading on why retention/memory is #1, Justin Skycak has a good, short blog.
3. In CogSci/Education research there are terms such as Storage Strength and Retrieval Strength, or Depth of Knowledge and Breadth of Knowledge, which capture similar ideas.
4. Blunt, Relevant, Atomic, Accurate