Thirty-six participants were recruited from the participant pool at the Donders Centre for Cognitive Neuroimaging. Sample size was chosen to detect a within-subject effect of at least medium size (d > 0.5) with 80% power using a two-tailed one-sample or paired t-test. The study was in accordance with the institutional guidelines of the local ethical board (CMO region Arnhem-Nijmegen, The Netherlands, Protocol CMO2014/288); all participants gave informed consent and received monetary compensation. Participants were invited for an fMRI session and a prior behavioural training session, which took place no more than 24 h before the fMRI session. For one participant, who moved heavily between runs, decoding accuracy was never above chance; this participant was excluded from all fMRI analyses. One other participant had their eyes closed for an extended duration during more than 20 trials, and was excluded from both the behavioural and fMRI analyses. All remaining participants (n = 34, 12 male, mean age = 23 ± 3.32 years) were included in all analyses. Due to technical problems, one participant only completed four instead of six blocks, all of which were analysed.

Stimuli were generated using Psychtoolbox-3 (ref. 49) running on MATLAB (MathWorks, MA, USA). Stimuli were rear-projected using a calibrated EIKI (EIKI, Rancho Santa Margarita, CA) LC XL 100 projector (1024 × 768, 60 Hz). Each stimulus was a five-letter word or nonword presented in a custom-built monospaced typeface. To prevent the multivariate analyses from picking up on global low-level features (such as overall luminance or contrast) to discriminate middle letter identity, the middle characters (U or N) were chosen to be identical in shape and size, but flipped vertically with respect to each other. Words were presented in a large font size, each letter 3.6° wide and with 0.6° spacing between letters. This size was chosen to make the middle letter as large as possible while retaining readability of all letters when fixating at the centre. In addition to the words and nonwords, a fixation dot of 0.8° in diameter was presented at the centre of the screen. To make reading visually challenging and incentivize top–down enhancement of low-level visual features, words were embedded in visual noise. The noise consisted of pixelated squares, each 1.2° wide, offset so that the pixels were misaligned with the letter strokes. Letters were presented on top of the noise with 80% opacity. We chose this type of noise after finding it impacted readability strongly even when the letters were presented at high physical luminance. Brightness values (in the range 0–255) of the noise ‘pixels’ were randomly sampled from a Gaussian distribution with a mean of 128 and an SD of 50. To make sure that the local brightness was on average identical for each trial and across the screen, the noise patches were generated using a pseudo-random procedure.
In each trial, ten noise patches were presented, five of which were independent and randomly generated, while the other five were copies of the random patches, but polarity-inverted in terms of their relative brightness with respect to the mean. This way the brightness of each noise pixel was always 128 (grey) on average in each trial. The order of noise patches was pseudo-random, with the constraint that inverted copies were never presented directly before or after their original noise patch. This way the re-use of noise patches was not noticeable and all patches seemed random.
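The patch-generation procedure can be illustrated with a short sketch. This is not the original MATLAB/Psychtoolbox code: the function name, the 8 × 8 patch size, the symmetric clipping range and the rejection-sampling loop are all assumptions made for illustration. Only the Gaussian parameters (mean 128, SD 50), the five-plus-five structure and the adjacency constraint come from the text.

```python
import numpy as np

def make_noise_patches(n_random=5, shape=(8, 8), mean=128.0, sd=50.0, seed=0):
    """Return 2 * n_random patches (random ones plus polarity-inverted copies)
    in a pseudo-random order, together with the pair index of each patch."""
    rng = np.random.default_rng(seed)
    random_patches = rng.normal(mean, sd, size=(n_random, *shape))
    # Assumption: clip symmetrically around the mean so that inversion
    # stays inside the displayable 0-255 range.
    random_patches = np.clip(random_patches, mean - 127, mean + 127)
    # Polarity inversion relative to the mean brightness.
    inverted = 2 * mean - random_patches
    patches = np.concatenate([random_patches, inverted])
    pair_id = np.concatenate([np.arange(n_random), np.arange(n_random)])
    # Assumption: simple rejection sampling until no patch is adjacent
    # to its own inverted copy.
    while True:
        order = rng.permutation(2 * n_random)
        ids = pair_id[order]
        if np.all(ids[1:] != ids[:-1]):
            return patches[order], ids

patches, pair_ids = make_noise_patches()
# Each pixel averages to the mean brightness across the ten patches.
print(np.allclose(patches.mean(axis=0), 128.0))
```

Because every random patch is paired with its exact inversion, the per-pixel average across the ten patches equals the mean brightness by construction, regardless of presentation order.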

In the main experiment, we used a blocked design, in which we presented blocks of four long trials (one of each of the four conditions), followed by a null-trial. Each trial was 14 s long, during which ten stimuli were presented. Of those stimuli, nine or occasionally (in 25% of trials) eight were (non)word items and one or two were (learned) targets. A single presentation consisted of 900 ms of (non)word item plus noise background, and 500 ms of blank screen plus fixation dot (Fig. 1c). Targets were either presented in their regular (learned) form or with one of the non-middle letters permuted, and participants had to discriminate whether the target was regular or permuted. Target correctness and occurrence within the trial were counterbalanced and randomised, with the constraint that targets were never presented directly after each other. The order of word items was shuffled pseudo-randomly, with the constraint that the same letter was never repeated at the same position in consecutive items (except for the middle letter).

In the functional localiser run, only the middle letters (U and N) plus the fixation bull's eye were presented. We again used a blocked design, with long trials that had a duration of 14 s during which one of the letters was repeated at 1 Hz (500 ms on, 500 ms off; see Fig. 1b). During the localiser, each trial was followed by a null-trial in which only the fixation dot was presented for 9.8 s. This was repeated 18 times for each letter.

Two different sets of words and nonwords were used for the training and experimental session. For the experimental session, we used 100 five-letter Dutch words with a U or N as third character (see Supplementary Table 1), plus equally many nonword items. This particular subset was chosen because they were the 100 most frequent five-letter words with a U or N in Dutch, according to the subtlex database (ref. 50). Each item occurred at least four times and maximally five times (4.2 on average) during the entire experimental session; to ensure repetitions were roughly equally spaced, items were only repeated once all other items had been presented equally often. Because we wanted to familiarise participants with the task and the custom font, but not with the (non)word stimuli themselves (especially because there was considerable variation in the amount of training between participants), we used different (non)words for the training session. For the training session, we used the remaining 50 less frequent five-letter Dutch words with a U and N. For the nonwords, letters were randomly sampled according to the natural frequency of letters in written Dutch (ref. 51), with the constraint that adjacent letters were never identical. The resulting nonwords were then hand-selected to ensure all created strings were unpronounceable, orthographically illegal nonwords. The four learned target stimuli were CLUBS and ERNST for the words, and KBUOT and AONKL for the nonwords. These were learned during the prior training session.

Each participant performed one behavioural training session and one experimental fMRI session. The goal of the training was for participants to learn the four target items and learn how to perform the task while maintaining fixation at the centre of the screen. The fMRI session consisted of a brief practice of ~5 min during which the anatomical scan was acquired. This was followed by six experimental runs of 9–10 min, which were followed by a localiser run of ~15 min. Each experimental run consisted of 40 trials of 14 s. Trials were presented in blocks consisting of five trials: one of each condition (U-word, U-nonword, N-word, N-nonword), plus a null-trial during which only the fixation dot was present. The order of trial types within blocks was randomised and equalised: over the entire experiment, each order was presented twice, resulting in a total number of 240 trials (192 excluding nulls). In the functional localiser, single letters were presented blockwise: one letter was presented for 14 s, followed by a null-trial (9.8 s), followed by a trial of the other letter. Which letter came first was randomised and counterbalanced across participants.
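The trial counts can be checked with a few lines of arithmetic. The sketch below assumes that 'each order' refers to the orderings of the four stimulus conditions within a block, with the null-trial always closing the block; under that reading the reported totals follow directly.

```python
from itertools import permutations

conditions = ["U-word", "U-nonword", "N-word", "N-nonword"]

# Assumption: an 'order' is a permutation of the four stimulus conditions,
# and every block ends with one null-trial.
orders = list(permutations(conditions))
blocks = orders * 2                       # each order presented twice

n_blocks = len(blocks)
n_trials = n_blocks * (len(conditions) + 1)   # including null-trials
n_stimulus_trials = n_blocks * len(conditions)
n_runs = n_trials // 40                       # 40 trials per run

print(len(orders), n_blocks, n_trials, n_stimulus_trials, n_runs)
```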

For each (paired/one-sample) statistical comparison, we first verified that the distribution of the data did not violate normality and was outlier free, as determined by the D’Agostino and Pearson test implemented in SciPy and the 1.5 IQR criterion, respectively. If both criteria were met, we used a parametric test (e.g. paired t-test); otherwise, we resorted to a non-parametric alternative (e.g. Wilcoxon signed-rank test). All statistical tests were two-tailed and used an alpha of 0.05. For effect sizes, we report Cohen’s d for the parametric and biserial correlations for the non-parametric tests.
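A minimal sketch of this decision rule for a paired comparison is given below. It is illustrative rather than the analysis code itself: applying the checks to the paired differences, the helper names, and the use of the matched-pairs rank-biserial correlation as the non-parametric effect size are assumptions. `scipy.stats.normaltest` is SciPy's implementation of the D’Agostino and Pearson test.

```python
import numpy as np
from scipy import stats

def has_outliers(x):
    """1.5 IQR criterion: any value beyond 1.5 IQR from the quartiles."""
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return bool(np.any((x < q1 - 1.5 * iqr) | (x > q3 + 1.5 * iqr)))

def rank_biserial(diff):
    """Matched-pairs rank-biserial correlation (assumed effect size)."""
    diff = diff[diff != 0]
    ranks = stats.rankdata(np.abs(diff))
    pos, neg = ranks[diff > 0].sum(), ranks[diff < 0].sum()
    return (pos - neg) / (pos + neg)

def compare_paired(a, b, alpha=0.05):
    """Paired t-test if the differences look normal and outlier free,
    otherwise a Wilcoxon signed-rank test (both two-tailed)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    diff = a - b  # assumption: checks are applied to the paired differences
    normal = stats.normaltest(diff).pvalue > alpha
    if normal and not has_outliers(diff):
        res = stats.ttest_rel(a, b)
        cohens_d = diff.mean() / diff.std(ddof=1)
        return "paired t-test", res.statistic, res.pvalue, cohens_d
    res = stats.wilcoxon(a, b)
    return "wilcoxon", res.statistic, res.pvalue, rank_biserial(diff)

rng = np.random.default_rng(0)
word = rng.normal(0.6, 0.1, size=34)      # hypothetical per-participant values
nonword = rng.normal(0.55, 0.1, size=34)
print(compare_paired(word, nonword))
```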

Functional and anatomical images were collected with a 3T Skyra MRI system (Siemens), using a 32-channel headcoil. Functional images were acquired using a whole-brain T2*-weighted multiband-4 sequence (TR/TE = 1400/33.03 ms, voxel size = 2 mm isotropic, 75° flip angle, A/P phase encoding direction). Anatomical images were acquired with a T1-weighted MP-RAGE (GRAPPA acceleration factor = 2, TR/TE = 2300/3.03 ms, voxel size = 1 mm isotropic, 8° flip angle).

fMRI data pre-processing was performed using FSL 5.0.11 (FMRIB Software Library; Oxford, UK; ref. 52). The pre-processing pipeline included brain extraction (BET), motion correction (MCFLIRT) and temporal high-pass filtering (128 s). For the univariate and univariate-multivariate coupling analyses, data were spatially smoothed with a Gaussian kernel (4 mm FWHM). For the multivariate analysis, no spatial smoothing was applied. Functional images were registered to the anatomical image using boundary-based registration as implemented in FLIRT and subsequently to the MNI152 T1 2-mm template brain using linear registration with 12 degrees of freedom. For each run, the first four volumes were discarded to allow for signal stabilisation. Most FSL routines were accessed using the nipype framework (ref. 53). Using simple linear registration to align between participants can result in decreased sensitivity compared to more sophisticated methods like cortex-based alignment (ref. 54). However, note that using a different inter-subject alignment method would not affect any of the main analyses, which were all performed in native EPI space. The only analysis that could be affected is the whole-brain version of the information-activation coupling analysis (Fig. 4c; Supplementary Fig. 11). However, this was only an exploratory follow-up on the pre-defined ROI-based coupling analysis, intended to identify potential other regions displaying the signature increase in coupling. For this purpose, the simple linear method was deemed appropriate.

To test for differences in univariate signal amplitude between conditions, voxelwise GLMs were fit to each run’s data using FSL FEAT. For the experimental runs, GLMs included four regressors of interest, one for each condition (U-word, U-nonword, etc.). For the functional localiser runs, GLMs included two regressors of interest (U, N). Regressors of interest were modelled as binary factors and convolved with a double-gamma HRF. In addition, nuisance regressors were added for the first-order temporal derivatives of the regressors of interest, and 24 motion regressors (six motion parameters plus their Volterra expansion, following Friston et al., ref. 55). Data were combined across runs using FSL’s fixed-effects analysis. All reported univariate analyses were performed on an ROI basis by averaging all parameter estimates within a region of interest, and then comparing conditions within participants (see Supplementary Figs. 4, 5).

For the multivariate analyses, spatially non-smoothed, motion-corrected, high-pass filtered (128 s) data were obtained for each ROI (see below for ROI definitions). Data were temporally filtered using a third-order Savitzky-Golay low-pass filter (window length 21) and z-scored for each run separately. Resulting timecourses were shifted by three TRs (i.e. 4.2 s) to compensate for HRF lag and averaged over time within each trial, and null-trials were discarded. For each participant, this resulted in 18 samples per class for the localiser (i.e. training data) and 96 samples per condition (word/nonword) for the main runs (i.e. testing data).
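These steps can be sketched as follows for one run of one ROI. The function names, the synthetic data and the trial onsets are hypothetical; the filter order, window length and three-TR shift follow the text, and the ten-TR trial length follows from a 14 s trial at a TR of 1.4 s. Applying the filter and z-scoring along the time axis of each voxel is an assumption.

```python
import numpy as np
from scipy.signal import savgol_filter

TRIAL_TRS = 10   # 14 s trial / 1.4 s TR
SHIFT_TRS = 3    # 4.2 s shift to compensate for HRF lag

def preprocess_run(data):
    """data: (n_timepoints, n_voxels) array for one run of one ROI."""
    # Third-order Savitzky-Golay filter, window length 21, along time.
    smooth = savgol_filter(data, window_length=21, polyorder=3, axis=0)
    # z-score each voxel within the run.
    return (smooth - smooth.mean(axis=0)) / smooth.std(axis=0)

def trial_averages(run, onsets_tr):
    """Average the shifted timecourse within each trial.
    onsets_tr: trial onsets in TRs (null-trials simply left out)."""
    return np.stack([
        run[t + SHIFT_TRS: t + SHIFT_TRS + TRIAL_TRS].mean(axis=0)
        for t in onsets_tr
    ])

rng = np.random.default_rng(0)
run = preprocess_run(rng.normal(size=(120, 200)))     # synthetic run
samples = trial_averages(run, onsets_tr=[0, 10, 20])  # hypothetical onsets
print(samples.shape)
```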

For the classification analysis, we used a logistic regression classifier implemented in sklearn 0.2 (ref. 56) with all default settings. The model was trained on the time-averaged data from the functional localiser run and tested on the time-averaged data from the experimental runs. Because we had the same number of samples for each class, binary classification performance was evaluated using accuracy (%).
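The cross-decoding scheme (train on localiser, test on main experiment) can be sketched with scikit-learn on synthetic data. The simulated voxel patterns, the signal strength and the helper `simulate` are invented for illustration, and the default settings of a current scikit-learn release may differ from those of the version used in the study.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_vox = 200
pattern = rng.normal(size=n_vox)   # hypothetical U-versus-N voxel pattern

def simulate(n_per_class, signal=0.5):
    """Synthetic time-averaged trials; label 1 = 'U', 0 = 'N'."""
    y = np.repeat([1, 0], n_per_class)
    noise = rng.normal(size=(2 * n_per_class, n_vox))
    X = noise + signal * np.outer(2 * y - 1, pattern)
    return X, y

X_loc, y_loc = simulate(18)     # localiser: 18 samples per class
X_main, y_main = simulate(48)   # e.g. one condition: 96 samples in total

clf = LogisticRegression()      # default settings
clf.fit(X_loc, y_loc)
accuracy = 100 * clf.score(X_main, y_main)   # accuracy in %
print(accuracy)
```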

For the pattern correlation analysis, only the time-averaged data from the main experiment were used. Data were randomly grouped into two arbitrary splits that both contained an equal number of trials of all four conditions (U-word, U-nonword, N-word, N-nonword). Within each split, the time-averaged data of each trial were again averaged to obtain a single average response for each condition per split. For the word and nonword conditions separately, these average responses were then correlated across splits. This resulted, for both word and nonword conditions, in two (Pearson) correlation coefficients: ρ_within and ρ_between, obtained by correlating the average response to stimuli with the same or different middle letter, respectively. This procedure was repeated 12 times, each time using a different random split of the data, and all correlation coefficients were averaged to obtain a single coefficient per comparison, per condition, per participant. Finally, pattern letter information for each condition was quantified by subtracting the two average correlation coefficients (ρ_within − ρ_between).
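A compact sketch of the split-half procedure for one condition (word or nonword) is shown below. The function name and the synthetic data are assumptions, as is the choice to average the two possible within-letter and the two possible between-letter correlations in each split.

```python
import numpy as np

def pattern_information(X, letter, n_splits=12, seed=0):
    """X: (n_trials, n_voxels) time-averaged data for one condition.
    letter: array of 'U'/'N' labels per trial.
    Returns rho_within - rho_between averaged over random split-halves."""
    rng = np.random.default_rng(seed)
    within, between = [], []
    for _ in range(n_splits):
        halves = {}
        for lt in ("U", "N"):
            idx = rng.permutation(np.flatnonzero(letter == lt))
            half = len(idx) // 2
            # Average response per letter in each of the two splits.
            halves[lt] = (X[idx[:half]].mean(axis=0), X[idx[half:]].mean(axis=0))

        def corr(a, b):
            return np.corrcoef(a, b)[0, 1]   # Pearson correlation

        within.append(np.mean([corr(halves["U"][0], halves["U"][1]),
                               corr(halves["N"][0], halves["N"][1])]))
        between.append(np.mean([corr(halves["U"][0], halves["N"][1]),
                                corr(halves["N"][0], halves["U"][1])]))
    return float(np.mean(within) - np.mean(between))

# Synthetic example with a letter-specific pattern.
rng = np.random.default_rng(1)
pattern = rng.normal(size=200)
letter = np.repeat(["U", "N"], 48)
sign = np.where(letter == "U", 1.0, -1.0)
X = rng.normal(size=(96, 200)) + 0.5 * np.outer(sign, pattern)
info = pattern_information(X, letter)
print(info)
```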

For the searchlight variant of the multivariate analyses, we performed exactly the same procedure as described in the manuscript. However, instead of using a limited number of a priori defined ROIs, we used a spherical searchlight ROI that slid across the brain. A searchlight radius of 6 mm was used, yielding an ROI size of about 170 voxels on average, similar to the 200 voxels in our main ROI. For both analyses, this resulted in a map for each outcome metric for each condition for each subject, defined in native EPI space. These maps were then used for subsequent analyses (see Supplementary Note 1).

For the information-activation coupling analysis, we used a GLM-based approach to predict local BOLD amplitude as a function of early visual cortex classification evidence, and tested for an increase in coupling (slope) for words compared to nonwords (see Fig. 4b). The GLM had one variable of interest, visual cortex classification evidence (see below for definition), that was defined on a TR-by-TR basis and split over two regressors, corresponding to the two conditions (word/nonword). In addition, first-order temporal derivatives of the two regressors of interest and the full set of motion regressors (from the FSL FEAT GLM) were included to capture variability in HRF response onset and motion-related nuisance signals, respectively. Because the classification evidence was undefined for null-trials, these were omitted. To compensate for temporal autocorrelation in the data, pre-whitening of the data was applied using the AR(1) noise model as implemented in nistats (ref. 56). The resulting GLM yielded two regression coefficients (one per condition) for each participant, which were then compared at the group level to test for an increase in coupling in word contexts. Conceptually, this way of testing for condition-dependent changes in functional coupling is analogous to PPI (ref. 16) but using a multivariate timecourse as a ‘seed’. This timecourse, classification evidence, was defined as the probability assigned by the logistic regression model to the correct outcome, or \(\hat{p}(A \mid y = A)\). This probabilistic definition combines aspects of both prediction accuracy and confidence into a single quantity. Mathematically it is defined, as in any binomial logistic regression classifier, via the logistic sigmoidal function:

$$\hat{p}\left( A \mid y = A \right) = \left\{ \begin{array}{ll} \frac{1}{1 + \mathrm{e}^{-\theta^{\mathrm{T}}\mathbf{X}}}, & \mathrm{if}\; y = 1 \\ 1 - \frac{1}{1 + \mathrm{e}^{-\theta^{\mathrm{T}}\mathbf{X}}}, & \mathrm{if}\; y = 0 \end{array} \right.,$$


where θ are the model weights, y is the binary stimulus category, X are the voxel response patterns for all trials, and the letter ‘U’ is coded as 1 and ‘N’ as 0. Note that while the value of \(\hat{p}(A \mid y = A)\) itself is bounded between 0 and 1, the respective regressors were no longer bounded after applying prewhitening to the design matrix (see Fig. 4b).
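As a small worked illustration of this definition, the sketch below computes classification evidence from given weights. The function name and the toy numbers are hypothetical, the optional intercept is an assumption (the formula as written has no separate intercept term), and X is arranged with one row per sample.

```python
import numpy as np

def classification_evidence(theta, X, y, intercept=0.0):
    """Probability assigned by a logistic model to the correct letter.
    theta: (n_voxels,) weights; X: (n_samples, n_voxels) response patterns;
    y: (n_samples,) true labels with 'U' coded as 1 and 'N' as 0.
    The intercept defaults to 0 and is an assumption of this sketch."""
    p_u = 1.0 / (1.0 + np.exp(-(X @ theta + intercept)))
    # p(U) if the true letter is U, otherwise 1 - p(U).
    return np.where(y == 1, p_u, 1.0 - p_u)

theta = np.array([1.0, 0.0])
X = np.array([[0.0, 0.0], [2.0, 0.0], [-2.0, 0.0]])
ev_u = classification_evidence(theta, X, np.array([1, 1, 1]))
ev_n = classification_evidence(theta, X, np.array([0, 0, 0]))
print(ev_u, ev_n)
```

A sample on the decision boundary receives an evidence of 0.5 whichever letter was shown; confident correct predictions approach 1 and confident errors approach 0, which is how the measure combines accuracy and confidence.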

Two variants of the GLM analysis were performed: one on timecourses extracted from two candidate ROIs and one on each voxel independently. For the ROI-based approach, timecourses were extracted by taking the average timecourse of all amplitude-normalised (z-scored) data from two ROIs: left pMTG and VWFA (see ‘ROI definition’ for details). For the brain-wide variant, the same GLM was estimated voxelwise for each voxel independently. This resulted in a map with the difference in coupling parameters for each voxel, for each participant (β_word − β_nonword), defined in native MRI space. These maps were then transformed to MNI space, after which a right-tailed one-sample t-test was performed to test for voxels showing an increase in coupling in word conditions. The resulting p-map was converted into a z-map and thresholded using FSL’s Gaussian random-field-based cluster thresholding, using the default cluster-forming threshold of z > 3.1 (i.e., p < 0.001) and a cluster significance threshold of p < 0.05.

For the ROIs of V1–V4, fusiform cortex and inferior temporal cortex, Freesurfer 6.0 (ref. 57) was used to extract labels (left and right) per subject based on their anatomical image, which were transformed to native space and combined into a bilateral mask. Labels for V1–V2 were obtained from the default atlas (ref. 58), whereas V3 and V4 were obtained from Freesurfer’s visuotopic atlas (ref. 59). Early visual cortex (EVC) was defined as the union of V1 and V2.

The VWFA was functionally defined following a procedure based on earlier work (ref. 34). Briefly, we first took the union of left fusiform cortex and left inferior temporal cortex, defined via individual cortical parcellations obtained from Freesurfer, and trimmed the anterior parts of the resulting mask. Within this broad, left-lateralised ROI, we then selected the 200 voxels that were most selective to words over nonwords (i.e. words over orthographically illegal, unpronounceable letter strings) as defined by the highest Z-statistics in the respective word–nonword contrast in the univariate GLM. Similarly to Kay and Yeatman (ref. 34), we found that for most participants this resulted in a single, contiguous mask and in other participants in multiple word-selective patches. There are two main reasons we used the simple contrast word–nonword from the main experiment, rather than running a separate, dedicated VWFA localiser. First, using the main task strongly increased statistical power per subject, as we could use a full hour of data per participant to localise VWFA. Second, the comparison of words and unpronounceable letter strings (with matched unigram letter frequency) only targets regions that are selective to lexical and orthographic information (i.e. the more anterior parts of VWFA, according to the VWFA hierarchy reported in ref. 32). As such, the localiser only targets regions selective to the type of linguistic (lexical or orthographic) knowledge that could underlie the observed effect. This stands in contrast to other, less-restrictive VWFA definitions (such as words > phase-scrambled words, or words > false fonts).

For the multivariate stimulus representation analyses, we did not use the full anatomical ROIs defined above, but performed a selectivity-selection to ensure we probed voxels that were selective to the relevant part of the visual field. In this procedure, we defined the most selective voxels as those with the k highest Z-statistics when we contrasted any letter (U or N) versus baseline in the functional localiser GLM. Following ref. 15, we took 200 voxels as our predefined value for k. To verify that our results were not contingent on this specific (but arbitrary) value, we also made a large range of masks for early visual cortex by varying k between 50 and 1000 in steps of 10. Repeating the classification and pattern correlation analyses over all these masks revealed that the same pattern of effects was obtained over almost the full range of mask definitions, and that the best classification performance was in fact at our predefined value of k = 200 (Supplementary Fig. 3).
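The selectivity-selection amounts to keeping the k highest Z-statistics inside an anatomical mask. A minimal sketch on flattened arrays follows; the function name, the toy data and the variable `K_VALUES` are hypothetical.

```python
import numpy as np

def select_top_k(z_stats, roi_mask, k=200):
    """Boolean mask of the k voxels inside roi_mask with the highest
    Z-statistic (letter versus baseline in the localiser GLM)."""
    idx = np.flatnonzero(roi_mask)
    top = idx[np.argsort(z_stats[idx])[::-1][:k]]
    mask = np.zeros(roi_mask.shape, dtype=bool)
    mask[top] = True
    return mask

# Range of mask sizes used for the robustness check: 50, 60, ..., 1000.
K_VALUES = range(50, 1001, 10)

# Toy example: ten voxels, the last one lying outside the anatomical ROI.
z = np.arange(10.0)
roi = np.ones(10, dtype=bool)
roi[9] = False
print(np.flatnonzero(select_top_k(z, roi, k=3)))
```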

For the peripheral visual ROI, voxels were selected based on the functional criterion that they showed a strong response to stimuli in the main experiment (which spanned a large part of the visual field), but a weak or no response to stimuli in the localiser (which were presented near fixation). Specifically, voxels were selected if they were both in the top 50% of Z-statistics for the contrast visual stimulation > baseline in the main experiment, and in the bottom 50% of Z-statistics for visual stimulation > baseline in the localiser. This resulted in masks that contained on average 183 voxels, similar to the 200 voxels in the central ROI. In our initial analysis, we focussed on V1 (see Supplementary Fig. 9) because it has the strongest retinotopy. However, the same procedure was also applied to early visual cortex with similar results (see Supplementary Note 1).
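The peripheral selection rule can be written as the intersection of two median splits. In this sketch the function name is hypothetical, and computing both medians within the anatomical ROI (rather than over the whole brain) is an assumption.

```python
import numpy as np

def peripheral_mask(z_main, z_localiser, roi_mask):
    """Voxels in the top 50% of Z-statistics for the main experiment and
    the bottom 50% for the localiser, within roi_mask."""
    # Assumption: the 50% cut-offs are medians computed inside the ROI.
    strong_main = z_main >= np.median(z_main[roi_mask])
    weak_localiser = z_localiser <= np.median(z_localiser[roi_mask])
    return roi_mask & strong_main & weak_localiser

z_main = np.array([1.0, 2.0, 3.0, 4.0])
z_loc = np.array([4.0, 3.0, 2.0, 1.0])
roi = np.ones(4, dtype=bool)
print(peripheral_mask(z_main, z_loc, roi))
```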

To define pMTG, we performed an automated meta-analysis using Neurosynth (ref. 60). Because we were interested in pMTG as a hub for lexical access, we searched for the keyword ‘semantic’. This resulted in a contrast map based on 1031 studies, which we thresholded at an arbitrarily high Z-value of Z > 9. The resulting map was mainly restricted to two hubs, in the IFG and pMTG. We selected left pMTG by overlaying the map with an anatomical mask of middle temporal gyrus from FSL’s Harvard-Oxford Atlas. The resulting map was brought to native space by applying the registration matrix for each participant.

Participants had 1.5 s after target onset to respond. Response times below 100 ms were considered spurious and discarded. If two non-spurious responses were given, only the first response was considered and evaluated. Median response times and mean accuracies were computed for both (word and nonword) conditions and compared within participants.

Eye movements were recorded using an SMI iView X eye tracker with a sampling rate of 50 Hz. Data were pre-processed and submitted to two analyses: the number of trials during which eyes were closed for long periods, and a comparison of horizontal (reading-related) eye movements between conditions.

During pre-processing, all data points during which there was no signal (i.e. values were 0) were omitted. After removing periods with no signal, data points with spurious, extreme values (which sometimes occurred just before or after signal loss) were omitted. To determine which values were spurious or extreme, we computed the z-score for each data point, over the entire run and ignoring the periods where the signal was 0, and considered all values with a z-score higher than 4 extreme and spurious. Like the periods with no signal, these timepoints were also omitted in the following analysis. The resulting ‘cleaned’ timecourses were then visually inspected to assess their quality. For two participants, the data were of insufficient quality to include in any analysis. For six participants, there were enough data of sufficient quality to compare the overall amount of reading-related eye movements between conditions, but signal quality was insufficient to quantify the number of trials during which the eyes were shut for an extended period. This is because in these participants there were multiple periods of intermittent signal loss that were related to signal quality, not to the eyes being closed. To compare eye movements between conditions, we took the standard deviation of the gaze position along the reading (horizontal) direction within each trial. Because the resulting data contained outliers (i.e. trials during which the participants failed to maintain fixation), we took the median over trials in each condition (word/nonword), and compared them within participants (Supplementary Fig. 6). For the participants where the data were consistently of sufficient quality, periods of signal loss longer than 1.2 s were considered ‘eyes closed for extended period’.
As an inclusion criterion, we allowed no more than 25 trials during which eyes were closed for an extended period. This led to the exclusion of one participant, who had 33 trials during which the eyes were closed for an extended period. This participant was a clear outlier: of all participants with eye tracking data of sufficient quality to be included in this analysis, 14 had no trials during which eyes were closed for an extended period, and in the remaining 12 with at least one such trial the median number of trials was 3.5.
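The two eye-tracking pre-processing rules (dropping lost or extreme samples, and flagging long periods of signal loss) can be sketched as below. The function names and the synthetic trace are hypothetical, and treating the z-score threshold as two-sided (absolute value) is an assumption; the 50 Hz sampling rate, the z-score cut-off of 4 and the 1.2 s criterion come from the text.

```python
import numpy as np

FS = 50.0  # sampling rate in Hz

def valid_samples(x, z_max=4.0):
    """Boolean mask of usable samples: signal present (non-zero) and
    z-score, computed over non-zero samples only, within z_max."""
    x = np.asarray(x, float)
    valid = x != 0
    z = np.zeros_like(x)
    z[valid] = (x[valid] - x[valid].mean()) / x[valid].std()
    # Assumption: extreme values in either direction are rejected.
    return valid & (np.abs(z) <= z_max)

def long_signal_loss(x, min_dur=1.2):
    """(start, stop) sample indices of zero-signal runs longer than
    min_dur seconds; stop is exclusive."""
    lost = np.asarray(x) == 0
    padded = np.concatenate([[False], lost, [False]])
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)
    stops = np.flatnonzero(edges == -1)
    return [(int(s), int(e)) for s, e in zip(starts, stops)
            if (e - s) / FS > min_dur]

# Synthetic horizontal gaze trace with one long gap, one short gap
# and one extreme sample.
gaze = 500.0 + np.sin(np.arange(400) / 5.0)
gaze[50:150] = 0      # 2 s of signal loss
gaze[300:310] = 0     # 0.2 s of signal loss
gaze[200] = 5000.0    # spurious extreme value
mask = valid_samples(gaze)
print(long_signal_loss(gaze))
```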

Simulations were performed using a predictive coding formulation of the classic interactive activation model (refs. 6, 7). We begin by explaining the model at an abstract level, then outline the algorithmic and mathematical details in generic terms, and then specify the exact settings we used for our model architecture, and how we used them in our simulations.

The interactive activation model is a hierarchical neural network model which takes visual features as inputs, integrates these features to recognise letters, and then integrates letters to recognise words. Critically, activity in word-units is propagated back to the letter-level, making the letter detectors sensitive not only to the presence of features (such as the vertical bar in the letter E), but also to neighbouring letters (such as the orthographic context HOUS_ preceding the letter E). This provides a top–down explanation for context effects in letter perception, such as (pseudo)word superiority. The predictive coding formulation of this model was first described by Spratling (ref. 14). It uses a specific implementation of predictive coding, the PC/BC-DIM algorithm, that reformulates predictive coding (PC) to make it compatible with Biased Competition (BC) and uses Divisive Input Modulation (DIM) as the method for updating error and prediction activations. The goal of the network is to infer the hidden cause of a given pattern of inputs (e.g. the ‘hidden’ letter underlying a pattern of visual features) and create an internal reconstruction of the input. Note that the reconstruction is model-driven and not a copy of the input. Indeed, when the input is noisy or incomplete, the reconstruction will ideally be a denoised or pattern-completed version of the input pattern. Inference can be done hierarchically: at the letter-level, predictions represent latent letters given patterns of features, whilst at the word-level predictions represent latent words given patterns of letters (and reconstructions, inversely, represent reconstructed patterns of letters given the predicted word).

Mathematically, the network can be compactly described as consisting of three components: prediction units (y), reconstruction units (r) and error units (e), which can be captured in only three equations. First, at each level, error units combine the input pattern (x) and the reconstruction of the input (r) to compute the prediction error (e):

$$\mathbf{e} = \mathbf{x} \oslash \left[ \mathbf{r} \right]_{\varepsilon_2}.$$


Here, x is a (m by 1) input vector; r is a (m by 1) vector of reconstructed input activations; ⊘ denotes pointwise division and the square brackets denote a max operator: [v]ε = max(ε, v). This max-operator prevents division-by-zero errors when all prediction units are silent and there is no reconstruction. Following Spratling (ref. 14), we set ε₂ at 1 × 10⁻³. Division sets the algorithm apart from other versions of predictive coding that use subtraction to calculate the error (see Spratling, ref. 61, for review). The prediction is computed from the error via pointwise and matrix multiplication:

$$\mathbf{y} \leftarrow \left[ \mathbf{y} \right]_{\varepsilon_1} \otimes \mathbf{W}\mathbf{e}.$$


Here, W is a (n by m) matrix of feedforward weights that map inputs onto latent causes (e.g. letters), ⊗ denotes pointwise multiplication, the square brackets represent a max operator and ε₁ is set at 1 × 10⁻⁶. Each row of W maps the pattern of inputs to a specific prediction unit representing a specific latent cause (such as a letter) and can hence be thought of as the ‘preferred stimulus’ or basis vector for that prediction unit. The entire W matrix is then best thought of as comprising the layer’s model of its environment. Finally, from the distribution of activities of the prediction units (y), the reconstruction of expected input features (r) is calculated as a simple linear generative model:

$$\mathbf{r} = \mathbf{V}\mathbf{y},$$


where V is a (m by n) matrix of feedback weights that map predicted latent causes (e.g. letters) back to their elementary features (e.g. strokes) to create an internal reconstruction of the predicted input, given the current state estimate. As in many multilayer networks, the model adheres to a form of weight symmetry: V is essentially identical to Wᵀ, but its values are normalised so that each column sums to one. To perform inference, prediction units can be initialised at zero (or with random values) and Eqs. (2–4) are updated iteratively. To perform top–down hierarchical inference, reconstructions from a higher-order stage (e.g. recognising words) can be sent back to the lower-order stage (e.g. recognising letters) as additional input. To accommodate these recurrent inputs, additional weights have to be defined that are added to W and V as extra columns and rows, respectively. The strength of these weights is scaled to control the reliance on top–down predictions.
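The three update equations can be illustrated with a single-layer toy network. This is a minimal sketch, not the simulation code used for the paper: the function name, the number of iterations and the two-unit weight matrix are invented, and there is no word level or top–down input. V is derived from W as described above (the transpose of W, with each column normalised to sum to one).

```python
import numpy as np

EPS1, EPS2 = 1e-6, 1e-3

def pcbc_dim(x, W, n_iter=50):
    """Iterate the error, prediction and reconstruction updates.
    x: (m,) input; W: (n, m) feedforward weights.
    Returns prediction y, reconstruction r and error e."""
    V = W.T / W.T.sum(axis=0, keepdims=True)   # (m, n) feedback weights
    y = np.zeros(W.shape[0])                   # predictions start at zero
    for _ in range(n_iter):
        r = V @ y                              # reconstruction of the input
        e = x / np.maximum(EPS2, r)            # divisive prediction error
        y = np.maximum(EPS1, y) * (W @ e)      # multiplicative update
    r = V @ y
    e = x / np.maximum(EPS2, r)
    return y, r, e

# Toy example: two prediction units that each prefer two of four features.
# Assumption: the rows of W are scaled to sum to one.
W = np.array([[0.5, 0.5, 0.0, 0.0],
              [0.0, 0.0, 0.5, 0.5]])
x = np.array([1.0, 1.0, 0.0, 0.0])
y, r, e = pcbc_dim(x, W)
print(y, r)
```

At convergence the reconstruction matches the input and the error for the active features equals one, which is the fixed point of the divisive scheme: an error of one leaves the prediction unchanged under the multiplicative update.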

The interactive activation architecture we used was a modification of the network described and implemented by Spratling (ref. 14), extended to recognise five-letter words, trained on the Dutch subtlex vocabulary, and with a slight change in letter composition. Letters are presented to the network using a simulated font adapted from the one described by Rumelhart and Siple (ref. 62) that composes any character using 14 strokes (Supplementary Fig. 12). For our five-letter network, the input layer consists of five 14-dimensional vectors (one per character) that each represent the presence of 14 line segments for one letter position. Note that although it is conceptually easier to partition the input into five 14-dimensional vectors, in reality these were concatenated into a single 70-dimensional vector x.

At the first level, weight matrix W has 180 rows and 250 columns: rows comprise five slots of 36 alphanumeric units (5 × 36 = 180); the first 70 columns comprise five slots of 14 input features (5 × 14 = 70) and the last 180 columns route the top–down reconstruction from the word level. To define the weights of the 70 feedforward columns, we used an encoding function ϕ(c) that takes an alphanumeric character and maps it into a binary visual feature vector. For each alphanumeric character, the resulting feature vector was concatenated five times and the resulting 70-dimensional vector comprised the first row. This was repeated for all 36 alphanumeric characters and concatenated five times. The resulting numbers were then normalised so that the columns summed to one. The weights of the second 180 columns (inter-regional feedback coming from 5 × 36 letter reconstructions) were simply a 180 by 180 identity matrix multiplied by a scaling factor to control top–down strength. For our ‘top–down model’ (Fig. 3b), we set the scaling factor at 0.4; in the ‘bottom-up model’, we set it to 10−6 to effectively remove the influence of feedback. At the second level, weight matrix W had 6778 rows and 180 columns, representing 6776 Dutch five-letter words from the subtlex corpus, plus the two learned nonword targets (which we included in the vocabulary because participants learned these during training) and five times 36 alphanumeric characters. The orthographic frequency of letters as specified by the corpus was hard coded into the weights and then normalised to sum to one.
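
The bookkeeping of the first-level weight matrix can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the original MATLAB implementation. The function `phi` below is a hypothetical stand-in that returns an arbitrary fixed binary vector per character, not the actual 14-stroke font. The feedforward block follows the wording above literally (each character's feature vector repeated across the five positions), and the original code may organise this block differently.

```python
import random

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"  # 36 alphanumeric characters
N_SLOTS, N_FEATURES = 5, 14


def phi(c):
    """HYPOTHETICAL encoder: a fixed pseudo-random binary 14-vector per character.

    The real phi(c) maps a character onto the 14 strokes of the simulated font.
    """
    rng = random.Random(ALPHABET.index(c))
    return [float(rng.random() < 0.5) for _ in range(N_FEATURES)]


def first_level_weights(feedback_strength):
    """Build a 180-by-250 matrix: 70 feedforward columns plus 180 feedback columns."""
    # One 70-dimensional row per character (feature vector concatenated five times).
    char_rows = [phi(c) * N_SLOTS for c in ALPHABET]
    # Repeat the 36 rows for the five slots: 5 x 36 = 180 rows.
    ff = [row[:] for _ in range(N_SLOTS) for row in char_rows]
    # Normalise each feedforward column to sum to one (all-zero columns stay zero).
    for k in range(N_SLOTS * N_FEATURES):
        col_sum = sum(row[k] for row in ff)
        if col_sum > 0:
            for row in ff:
                row[k] /= col_sum
    # Feedback block: scaled 180-by-180 identity matrix.
    n_units = len(ff)
    fb = [[feedback_strength if i == j else 0.0 for j in range(n_units)]
          for i in range(n_units)]
    return [ff[i] + fb[i] for i in range(n_units)]


W_top_down = first_level_weights(0.4)    # 'top-down model'
W_bottom_up = first_level_weights(1e-6)  # 'bottom-up model'
```

Changing only the scaling factor switches between the two models while leaving the feedforward columns untouched.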

Although there are many implementational differences between this model and the classic connectionist version of the interactive activation model6,7, the version described here has been shown to capture all key experimental phenomena of the original model (see ref. 14 for details). Since our simulations only sought to validate and demonstrate a qualitative principle, not subtle quantitative effects, the exact numerical differences related to the differences in implementation should not matter for the effect we demonstrate here.

Because our paradigm is different from classical paradigms, we performed simulations to confirm that the top–down account indeed predicts the representational enhancement we set out to detect. Although the main simulation result (Fig. 3a) is not novel, our simulation, by mirroring our paradigm, departs from earlier simulations in some respects, which we clarify before going into the implementation details. First, most word superiority studies present stimuli near-threshold: words are presented briefly, followed by a mask, and average identification accuracies typically lie between 60 and 80%. This is mirrored in most classic simulations, where stimuli are presented to the network for a limited number of iterations and followed by a mask, leading to similar predicted response accuracies7,14. In our task, stimuli are presented for almost a second, and at least the critical middle letter is always clearly visible. This is mirrored in our simulations, where stimuli are presented to the network until convergence and predicted response accuracies of the network are virtually 100% in all conditions (see Supplementary Fig. 2). As such, an important aspect to verify was that enhancement of a critical letter can still occur when it is well above threshold and response accuracy would be virtually at 100% already. Second, our simulations used the same Dutch word and nonword materials used in the experiment. This includes the occurrence of learned targets in the nonword condition, which we added to the vocabulary of the network and which were hence a source of contamination, as 12% of the items in the nonword condition were in fact in the vocabulary. Finally, unlike classical simulations, stimuli were corrupted by visual noise.

For Fig. 3a, we simulated 34 artificial ‘runs’. In each run, 48 words and 48 nonwords were presented to a network with feedback connections (feedback weight strength 0.4) and to a network without word-to-letter feedback (feedback weight strength 10−6). The same Dutch five-letter (non)words were used as in the main experiment and, as in the experiment, 12% of the (non)word items were replaced by target items. Critically, the nonword targets were learned and hence were part of the vocabulary of the network. To present a (non)word to the network, each character c first has to be encoded into a set of visual features and then corrupted by visual noise to produce an input vector x:

$${\mathbf{x}} = {\upvarphi}\left( c \right) + {\cal{N}}\left( {\mu ,\,\sigma ^2} \right).$$


For μ we used 0, σ was set to 0.125, and any values of x that became negative after adding white noise were zeroed. The network then tried to recognise the word by iteratively updating its activations using Eqs. (2–4), for 60 iterations. To compute the ‘relative evidence’ metric we used in Fig. 3a to quantify representational quality q(y), we simply take the activation for the correct letter (yi) as a fraction of the sum of letter activations for all characters at the third slot:

$$q\left( {\mathbf{y}} \right) = \frac{{{\mathbf{y}}_i}}{{\mathop {\sum }\nolimits_{j \, = \, 37}^{73} {\mathbf{y}}_j}}.$$
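
The noise corruption and the relative evidence metric can be sketched in a few lines of Python. This is an illustrative sketch rather than the code used for the simulations. The function names are hypothetical, and `relative_evidence` takes the activations of the letter units at one slot as a plain list, so the index convention of the full activation vector is left aside.

```python
import random


def corrupt(features, mu=0.0, sigma=0.125, rng=random):
    """Add Gaussian noise to each visual feature and zero any negative values."""
    return [max(0.0, f + rng.gauss(mu, sigma)) for f in features]


def relative_evidence(slot_activations, correct_index):
    """Activation of the correct letter as a fraction of all activations at that slot."""
    total = sum(slot_activations)
    if total <= 0:
        raise ValueError("slot activations must have a positive sum")
    return slot_activations[correct_index] / total


# Example: a hypothetical 14-stroke binary pattern corrupted with a seeded generator.
clean = [1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0]
noisy = corrupt(clean, rng=random.Random(0))
```

Passing a seeded `random.Random` instance keeps individual simulated runs reproducible.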


Finally, to compute predicted response probabilities as in Supplementary Fig. 2, we followed McClelland and Rumelhart in using Luce’s rule to compute responses probabilistically:

$$p\left( {R_i} \right) = \frac{{{\mathrm{e}}^{\beta {\mathrm{y}}_{\mathrm{i}}}}}{{\mathop {\sum }\nolimits_{j \, = \, 37}^{73} {\mathrm{e}}^{\beta {\mathrm{y}}_j}}}.$$


The β parameter (or inverse softmax temperature) determines how rapidly the response probability grows as yi increases (i.e. the ‘hardness’ of the argmax operation) and was set at 10, following McClelland and Rumelhart6,7, but results are similar for any typical β value that is approximately in the same order of magnitude.
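
Luce's rule as written above is a softmax over the letter activations at one slot. The following Python sketch (the function name is hypothetical) illustrates the role of β. Subtracting the maximum activation before exponentiating is a standard numerical safeguard that leaves the probabilities unchanged, and is not something described in the original implementation.

```python
import math


def luce_probabilities(activations, beta=10.0):
    """Response probabilities p(R_i) = exp(beta * y_i) / sum_j exp(beta * y_j)."""
    peak = max(activations)
    # Shifting by the maximum avoids overflow and does not change the ratios.
    weights = [math.exp(beta * (a - peak)) for a in activations]
    total = sum(weights)
    return [w / total for w in weights]


# Example with three hypothetical letter activations.
p_default = luce_probabilities([0.2, 0.9, 0.1])          # beta = 10
p_soft = luce_probabilities([0.2, 0.9, 0.1], beta=1.0)   # flatter distribution
```

With a larger β the most active letter receives a larger share of the response probability, which is the sense in which β controls the ‘hardness’ of the argmax.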

All simulations were performed using custom MATLAB code, which was an adaptation and extension of the MATLAB implementation published by Spratling14.

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

