Effects of Pedagogical Questioning on Singaporean Young Children’s Learning of Novel Categories

There has been a longstanding debate about the advantages and disadvantages of two polarities of teaching methods: direct instruction and discovery learning. Research has shown that questioning might be a viable pedagogical method that combines the advantages of both. When pre-schoolers in the US explored a novel toy with multiple hidden functions, pedagogical questions – questions asked by a knowledgeable teacher who aims to guide children towards learning – have been shown to facilitate more learning and exploration compared to direct instruction or questions asked by a naïve confederate. The current study investigated whether these effects can be observed in Singaporean children’s learning of novel categories. A total of 30 children aged 5–7 (M = 6.51, SD = 0.45) were recruited and randomly assigned to one of four conditions. In all conditions, children were asked to find out the rule for categorising two types of novel robots by exploring exemplars. Before children started exploring, a hint was given either by a teacher in the form of a direct instruction, by the teacher through a question, by a confederate through a question, or not given at all. We then measured how much the children explored the exemplars and whether they categorised new cards and identified the rules correctly. Results showed no significant difference between any of the four conditions, which may be due to the small sample size. If a larger sample can confirm the research hypotheses, it will have implications for teachers’ choice of pedagogical methods in early childhood education.


Introduction
In educational psychology, there has been a longstanding debate between two polarities of teaching methods: direct instruction (DI) and discovery learning (DL). Proponents of DI suggest that humans learn from direct instructional guidance provided by knowledgeable others, and research has found that it is superior to DL in ensuring efficient and effective learning of target information (Alfieri et al., 2011; Csibra and Gergely, 2009; Kirschner et al., 2006; Klahr and Nigam, 2004; Mayer, 2004). Meanwhile, advocates of DL propose that humans learn best when they discover information by themselves in a minimally guided environment, which develops curiosity and facilitates further learning (Bruner, 1961; Bruner et al., 1976; Hirsh-Pasek et al., 2009; Singer et al., 2006).
As both pedagogies have their advantages and disadvantages, researchers have posited an alternative pedagogical method that combines the benefits of both: questioning. This teaching method falls under enquiry-discovery, which lies between DL and DI and integrates adult guidance into the student's exploratory play (Dobber et al., 2017; Kriewaldt et al., 2021). Wise and Okey (1983) conducted a meta-analysis of the effects of 12 teaching methods on students' achievement. Teaching techniques included questioning, teacher direction and discovery. Results showed that questioning had the largest effect size on students' cognitive outcomes. In addition, studies have shown that children explore more when they are faced with conflicting evidence compared to evidence that confirms their hypotheses (Schulz and Bonawitz, 2007; van Schijndel et al., 2015). This suggests that uncertainty between possible hypotheses, which is the essence of questions, may increase children's self-guided behaviour.
Reinvention: an International Journal of Undergraduate Research 16:2 (2023)
However, different types of questions exist. According to Yu et al. (2018), questions can differ depending on the knowledge state and intention of the questioner. Pedagogical questions (PQ) are questions asked by a knowledgeable person whose intention is to teach the learner. For example, a teacher may ask the child, 'What does this button on the toy do?' in an attempt to teach about that particular toy function. On the other hand, naïve questions (NQ) are questions asked by an unknowledgeable person whose intention is to seek an answer from the questionee. An example would be a naïve informant asking the same question, 'What does this button on the toy do?' in order to find out the correct answer.
Research has shown that the two types of questions elicit different inferences from learners. Based on a pedagogy model, it has been found that human learners interpret pedagogical and non-pedagogical situations differently (Shafto et al., 2012; Shafto et al., 2014). In the pedagogical scenario, the learner would infer that the knowledgeable adult has purposefully selected that particular information to be presented, while in the non-pedagogical situation, the learner would infer that the unknowledgeable person has merely chosen a random piece of information. Hence, Yu et al. (2018) predicted that children would infer that there is something to be learnt when asked a PQ, but that this inference would not occur when they are asked an NQ. The inference is important because it provides an opportunity for children to learn about the target information.
The computational model also provides an explanation for why DI and PQ differ in their effects on learners' exploration. Yu et al. (2018) elaborated that when children are presented with DI, which is pedagogical, they infer that the teacher has purposefully chosen what to instruct, and thus that what is not chosen does not need to be considered. Indeed, research has found that DI reduced exploration and further learning (Benefits et al., 2011). Conversely, when children are presented with PQ, Yu et al. (2018) explained that they would question why the teacher chose PQ instead of DI and eventually conclude that there may be more to explore and discover. Thus, Yu et al. (2018) predicted that although both DI and PQ would lead learners to learn about the target information, DI would limit exploration while PQ would encourage children to explore more.
To test their predictions, Yu et al. (2018) recruited 120 children aged 4 to 5 years and randomly assigned them to four conditions: 1) Direct Instruction (DI), 2) Pedagogical Question (PQ), 3) Pedagogical Question-Overheard (PQO), and 4) Naïve Question-Overheard (NQO). In all conditions, the child was led into a quiet classroom with both an experimenter and a confederate. The experimenter presented the child with a toy and told them that the confederate was unclear about how the toy works. In the DI condition, the experimenter said to the child, 'Push this button to see what happens', before pressing a button to demonstrate its function. In the PQ condition, the experimenter asked the child, 'What happens if you push this button?' In the PQO condition, the experimenter asked the confederate the same question. Finally, in the NQO condition, the confederate was the one who asked the experimenter the question.
After the prompts were given, the experimenter allowed the child to play with the toy and ended the experiment when the child said he or she was done. The whole procedure was videotaped and coded, and results supported both of their hypotheses. Children in the DI, PQ and PQO conditions were significantly more likely to learn about the target toy function compared to children in the NQO condition. Additionally, children in the PQ and PQO conditions explored the toy and discovered novel toy functions significantly more than children in the DI condition.

Research gap
Existing published studies on PQ only focused on Western children (Jean et al., 2019; Yu et al., 2018). A literature search for similar studies conducted with non-Western children came up with no results. However, past research has found that parenting practices vary across cultures. Unlike their US counterparts, mothers from Japan ask significantly fewer questions of their infants (Bornstein et al., 1992; Toda et al., 1990). Johnston and Wong (2002) also found that compared to Western mothers, Chinese mothers are significantly more likely to believe that children learn best with instructions and less likely to believe that children can learn important information through play. This difference could influence children's perceptions of PQ, whereby Chinese and Japanese children are less familiar with their caregivers using questions to teach.
Hence, there may be cross-cultural differences in the impact of PQ. This current study recruited Singaporean children to help fill this gap in research.
Furthermore, both Yu et al. (2018) and Jean et al. (2019) utilised the novel toy paradigm to measure children's learning of causal relationships; for example, pressing a particular toy button would cause a yellow tower to light up. There is a dearth of research on the effects of PQ on children's learning of new categories, which are commonly taught in formal settings; for example, schools teach children about the difference between living and non-living things. Thus, this current study focused on the latter to test the robustness of the positive effects of PQ. Children would be introduced to two types of novel robots that differ by one determining feature and two non-determining features.

Hypotheses
This study had two hypotheses: 1) that children would learn about the determining feature better in the PQ and DI conditions than in the NQ and DL conditions; 2) that children would explore more and discover more non-determining features in the PQ condition compared to the DI condition.

Research ethics
This study was approved by the institutional review board (IRB) at Nanyang Technological University.
Parents were required to fill in a consent form for their child to participate. All the children were also given a form on which to indicate their willingness to participate, using a happy or sad face.

Participants
Thirty children aged 5 to 7 years (M = 6.51, SD = 0.45) were recruited from kindergartens and primary schools in Singapore. They were randomly assigned to one of the four conditions: pedagogical question (n = 8), direct instruction (n = 8), naïve question (n = 7) and discovery learning (n = 7). Parental consent was acquired before children participated in the experiment.

Materials
The materials used in this study were adapted from Williams and Lombrozo (2013). Planet Zarn was created using a laminated green A3-size paper for the base, corrugated boards for the houses, and coloured paper for the decorations (Figure 1). Daxes and Wugs, the two types of alien robots, were shown to be living on opposite sides of the planet. Demonstration cards (n = 32) were created to depict 16 Daxes and 16 Wugs, which vary in three physical features. The determining feature is the robots' feet: all Daxes have pointy feet and all Wugs have flat feet (although the exact shape of the feet can differ within each category).

Procedure
Children were brought into a quiet room in their kindergarten or primary school, and the experimenter asked them to sit beside a confederate. In all conditions, the experimenter introduced herself as the teacher, and both the child and the confederate as students. Afterwards, the experimenter explained about planet Zarn and the two different types of robots living on it, Daxes and Wugs. The experimenter proceeded to tell the child that his or her task was to figure out the differences between Daxes and Wugs and asked both the child and confederate to guess what the differences were. After the guesses, the experimenter followed up with a prompt depending on the experimental condition. In the PQ condition, the experimenter asked a question that hinted at the feet ('Which robot would be more likely to fall over?'). In the DI condition, the experimenter explicitly told the child what the determining feature is ('All Daxes have pointy feet and all Wugs have flat feet!'). In the NQ condition, the confederate asked the question instead of the experimenter. In the DL condition, no prompt was given.
The experimenter then allowed the child to explore the demonstration set on his or her own to find out the differences between both categories of robots. If the child stopped exploring the cards for more than two consecutive seconds or said he or she was finished, the experimenter asked, 'Are you done?' and ended the exploration phase if the child answered 'Yes'. After the child finished exploring, the experimenter presented him or her with the test and transfer cards and asked the child to point to the house each card belonged to (Daxes or Wugs). Finally, the child was asked to verbally state the differences between Daxes and Wugs. For each of the three features, one determining (feet) and two non-determining (shape and colour), the child was given a hint if he or she was unable to state any differences. The whole procedure was videotaped.

Coding
All videos were coded by two research assistants according to an agreed coding scheme (see Appendix). For research hypothesis 1, two outcome measurements were coded: number of test and transfer cards the child correctly sorted, and whether the child correctly recognised the determining feature. For research hypothesis 2, three outcome measurements were coded: length of exploration time, the number of cards explored, and whether the child correctly recognised both of the non-determining features. Inter-coder reliability was high for all measurements (number of test and transfer cards correctly sorted: κ = 1; correct recognition of determining feature: κ = 1; length of exploration time: r = .98; number of cards explored: r = .99; correct recognition of non-determining features: κ = .94 and 1).
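As an illustration of how an agreement statistic such as κ can be obtained for categorical codes, the following is a minimal sketch of unweighted Cohen's kappa. The paper does not state which kappa variant or software was used, so the choice of the unweighted form is an assumption, and the function name `cohens_kappa` and the example codes are hypothetical rather than data from the study.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Unweighted Cohen's kappa for two coders' categorical codes on the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same non-empty set of items")
    n = len(coder_a)
    # Proportion of items on which the two coders agree
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Agreement expected by chance, from each coder's category frequencies
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical recognition scores (0, 1 or 2) for ten children; not study data.
coder_a = [2, 2, 1, 0, 2, 1, 1, 0, 2, 2]
coder_b = [2, 2, 1, 0, 2, 1, 0, 0, 2, 2]
kappa = cohens_kappa(coder_a, coder_b)
print(f"kappa = {kappa:.2f}")
```

Kappa corrects raw percentage agreement for the agreement expected by chance, which is why it suits the categorical recognition scores, while a correlation coefficient suits the continuous measures such as exploration time.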

Transmission of target information
In contrast with the NQ and DL conditions, children in the DI and PQ conditions were predicted to learn about the determining feature that differentiates Daxes and Wugs. Our results did not support this hypothesis (Figures 3 and 4). There was no significant difference in the number of test and transfer cards correctly sorted (PQ: 9.75/16, DI: 10.25/16, NQ: 9.29/16, DL: 8.86/16; PQ and DI combined compared to NQ and DL combined, t(26) = .83, p > .05). There was also no significant difference in the correct recognition of the determining feature (PQ: 1/2, DI: 1.5/2, NQ: .86/2, DL: .86/2; PQ and DI combined vs NQ and DL combined, t(26) = 1.23, p > .05).
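To make the combined comparison concrete, the sketch below computes size-weighted means for the two combined groups from the card-sorting means reported here and the group sizes given in the Participants section. The labels `guided` and `unguided` and the helper `pooled_mean` are illustrative names only, and because the inputs are rounded group means the result is approximate.

```python
# Group size and mean number of cards sorted correctly (out of 16), as reported.
groups = {
    "PQ": (8, 9.75),
    "DI": (8, 10.25),
    "NQ": (7, 9.29),
    "DL": (7, 8.86),
}


def pooled_mean(names):
    """Mean across several conditions, weighted by each condition's size."""
    total_n = sum(groups[g][0] for g in names)
    weighted_sum = sum(groups[g][0] * groups[g][1] for g in names)
    return weighted_sum / total_n


guided = pooled_mean(["PQ", "DI"])    # conditions with a teacher-given hint
unguided = pooled_mean(["NQ", "DL"])  # confederate question or no hint
difference = guided - unguided
print(f"PQ+DI: {guided:.3f}, NQ+DL: {unguided:.3f}, difference: {difference:.3f}")
```

On these figures the combined groups differ by less than one card out of 16, which is the descriptive gap behind the non-significant test.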

Exploration and further learning
We predicted that children in the PQ condition would explore more and discover more non-determining features than children in the DI condition. However, our results did not support this hypothesis (Figures 5, 6 and 7).

Discussion
This study examined the effects of PQ on Singaporean children's learning of new concepts. We hypothesised that 1) children would learn about the determining feature better in the PQ and DI conditions than in the NQ and DL conditions, and 2) children would explore more and discover more non-determining features in the PQ condition compared to the DI condition. Preliminary findings so far with a small sample do not support either hypothesis. Children across conditions did not differ significantly in the number of cards they sorted correctly or in their recognition of the determining feature. Additionally, children in the DI and PQ conditions did not differ in their exploration time, number of cards explored or recognition of non-determining features.
Our findings are inconsistent with the results found by Yu et al. (2018). A plausible explanation could be that Singaporean children, unlike their Western peers, are unfamiliar with PQ as a pedagogical tool. Past research has found that Chinese and Japanese parents asked significantly fewer questions and believed that children learn best through instruction compared to Western parents (Bornstein et al., 1992; Johnston and Wong, 2002; Toda et al., 1990). This suggests that Chinese and Japanese parents rarely use questions to teach; hence their children may not have learnt to associate questions with teaching at such a young age. When the knowledgeable 'teacher' in the PQ condition asked Singaporean children a question, they may not have viewed the question as pedagogical, and so may not have inferred that the 'teacher' intended to teach about the information contained in the question. Without making this inference, the children would not be aware that there was target information to be learnt. This could explain why the question that was supposed to be pedagogical did not have an effect on their learning of the determining feature. However, it is surprising to note that DI did not guarantee children's learning of the determining feature. Prompting Singaporean children with DI should have ensured that they learnt about the target information, considering that DI is often used in Singaporean culture. The children should have had a strong enough association between DI and teaching to make the aforementioned inference. Future studies could look into this area to determine the reasons why DI was not effective in this study.
PQ also did not lead Singaporean children in this study to explore more. This may be because they did not view the 'teacher's' question as pedagogical; hence they did not question why the 'teacher' chose a question instead of DI to teach. As a result, they would not have concluded that there may be more to be explored and learnt.
There are a few limitations in this study. First, the number of cards explored may have been coded slightly inaccurately, despite great attempts to ensure the correct codes. During the exploration phase, some children took out a stack of demonstration cards and flipped through them using the side nearest to the robots' feet. As the camera was filming the children from the experimenter's side, it was difficult to discern the correct number of cards the child looked at from the video. This may have influenced the final results shown. Future studies could look into implementing measures to assist coders in counting the correct number of cards the child viewed.
Second, this study used two-dimensional cards, which do not allow the children to experiment with the attributes of the robots. Hence, the children had to rely solely on sight and abstract thought to figure out the differences. Future studies can look into using three-dimensional toys that children can feel and touch. Such toys would allow the children to attempt standing the robots up, thus making it easier for them to find out the differences in feet.
Third, this study used a small sample size. Originally, we aimed for a minimum of 20 children in each condition. However, due to COVID-19 restrictions by the government and participating schools, each of our conditions had fewer than ten children. A small sample size increases the likelihood of a Type II error, which occurs when a false null hypothesis is not rejected (Columb and Atkinson, 2016). It is too early to conclude that PQ does not have an effect on Singaporean children's learning of new concepts. Future studies should target a larger sample size for a reliable final conclusion to be made.
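The link between sample size and Type II error can be illustrated with a rough power calculation. The sketch below uses a normal approximation for a two-sided comparison of two group means, which somewhat overstates power for very small groups compared with an exact t-test calculation. The effect size d = 0.8 is an assumed 'large' effect chosen purely for illustration, not an estimate from this study, and `approx_power` is a hypothetical helper name.

```python
from statistics import NormalDist


def approx_power(effect_size_d, n_per_group, alpha=0.05):
    """Normal-approximation power for a two-sided, two-group comparison of means."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Expected size of the test statistic under the assumed effect
    noncentrality = effect_size_d * (n_per_group / 2) ** 0.5
    # Probability of falling in either rejection region under the alternative
    return z.cdf(noncentrality - z_crit) + z.cdf(-noncentrality - z_crit)


# d = 0.8 is an assumption for illustration only.
for n in (8, 20, 40):
    power = approx_power(0.8, n)
    print(f"n per group = {n}: power = {power:.2f}, Type II error rate = {1 - power:.2f}")
```

Under these assumptions, power rises steadily with the number of children per group, which is the reasoning behind the original target of at least 20 children per condition.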
If a larger sample size does confirm both research hypotheses, it would have implications for Singaporean teachers' choice of pedagogical methods in early childhood education. Singapore's Ministry of Education (2012) developed a kindergarten curriculum framework titled 'iteach', which had the purpose of teaching children through guided play. If PQ was eventually found to be effective, pre-school teachers could incorporate this evidence-based pedagogical method into the current 'iteach' curriculum. In the future, studies could also investigate the effects of PQ across different learning domains, such as mathematics. We believe PQ to be a promising field in pre-school education that could improve on the current pedagogical methods used worldwide.

Conclusion
Our results showed that, in this instance, PQ had no significant effect on Singaporean children's learning of new categories, in terms of either their learning of the target information or their exploration.
It must be emphasised that the small sample size is insufficient for testing both research hypotheses and for drawing conclusions about the effectiveness of PQ. Future analysis using a larger sample size is required, and PQ is recommended as a topic for further research, considering its potential implications for early childhood education.

Total cards
Count the number of cards that the child touches and looks at.
If the child touches a card but does not look at the robot, do not count it as a card (e.g. child takes a whole stack at once but only looks at the top card: this is counted as one rather than the number of cards in the stack). However, if the child looks at the cards one by one in the stack (even briefly, like fanning it out and seeing part of each card), count all cards in the stack.
If the numbers of cards coded by both coders are not too far apart (1-2 cards), take the average as the final code.

No. of switches
Count the number of times the child switches from one stack to touching the other.
Do not count if the child looks briefly at the other stack.
If unsure (e.g. child holds cards from two stacks at once), do not count it as a switch.

Exploration
General rule of thumb: The start is when the child touches the first card spontaneously, and the end is when the child indicates that he or she is finished (turns around to look at the experimenter or says he or she is done).
If prompting was required by the experimenter at the beginning, do not start counting the time before that. Start counting the time from when the child touched or looked at the cards after prompting from the experimenter.
If the child did not indicate to the experimenter that he or she is done but it is clear that he or she is no longer exploring (e.g. stares into space), end the exploration time there.
If the exploration times of both coders are not too far apart (10s), take the average as the final code.
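The averaging rule used in this scheme for card counts and exploration times can be sketched as a small helper. The scheme does not say what happens when the two coders are further apart than the stated tolerance, so returning `None` to flag the item for review is an assumption, as are the name `reconcile` and the example values.

```python
def reconcile(code_a, code_b, tolerance):
    """Average two coders' values if they differ by at most `tolerance`.

    Returns None when the gap is larger so the item can be flagged for
    review; the scheme itself does not specify that case (assumption).
    """
    if abs(code_a - code_b) <= tolerance:
        return (code_a + code_b) / 2
    return None


CARD_TOLERANCE = 2     # "1-2 cards" in the scheme
TIME_TOLERANCE_S = 10  # "10s" in the scheme

# Hypothetical codes from two coders; not data from the study.
cards = reconcile(14, 15, CARD_TOLERANCE)
seconds = reconcile(62.0, 80.0, TIME_TOLERANCE_S)
print(cards, seconds)
```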

Testing phase
General rule of thumb: 2 goes to children who know the correct difference between both robots. This includes cases where the child says, 'One is flat, one is pointy' or 'Some are flat and some are pointy' and is able to state which is flat and which is pointy after the experimenter asks 'Which one is flat and which one is pointy?' or gives the hint. As long as the child notices the feature and answers the prompt correctly, code it as 2. This is the case when the child first says, 'No difference', but when the experimenter asks, 'Are there any differences between their feet?', they mention 'shape' and manage to answer the prompt correctly.
1 goes to children who can state the difference after they were given a hint.
0 goes to children who do not know the difference even after a hint was given. If the child says, 'Daxes/Wugs are both blue and yellow', score 0 (ideally this should be clarified by the experimenter). If the child says square head but round body and vice versa, score 0.
If the child changed his or her answer spontaneously (without getting any cues from the experimenter), code his or her final answer. If any cue is given from the experimenter that prompted the child's change of answer (can even be a look or 'Huh?'), code the child's first answer.
Count if the child's response reflects their understanding of the difference (e.g. child says 'one can stand, one cannot stand') even if the specific characteristic was not stated.

Body shape
Count variations of 'body' (e.g. child says 'head') and variations of shape (e.g. child says 'circle' instead of 'round').

Figure 3: Correct sorting of new cards.

Figure 5: Length of exploration time.


Table 1 shows the distribution of the different cards.

Table 1: Distribution of demonstration cards depicting Daxes and Wugs.
Other than the demonstration cards, there are eight test cards and eight transfer cards. The test cards are copies of cards from the demonstration set, while the transfer cards follow the determining and non-determining feature rules but have new feet shapes. The test and transfer cards each consist of four Daxes and four Wugs.
