Abstract
The size of India’s food deficit became a pressing question for the Indian state in the early years of independence. As different organizations, government bodies, and individuals debated the ways, means, and expertise needed to tide over the food crisis, policymakers realized that the primary requirement was to have a numerical understanding of the problem. Data became crucial to accurately assess production trends and compare them with requirements. This article looks into the use of statistical methods, particularly random sampling and production estimation through a crop-cutting technique. Exploring the statistical survey work done by P.C. Mahalanobis in Bengal from the late years of colonial rule to the surveys conducted by the Indian Council of Agricultural Research under the supervision of P.V. Sukhatme and V.G. Panse, the article analyzes how different factors, such as varying revenue systems of different regions and administrative structures, power struggles amongst statisticians, and leverage gained by Indian statisticians from support they received from better known British counterparts, all played a role in determining the nature of statistical tools adopted in India to measure its food production. Inaccurate data continued to be a challenge for the Indian state until well into the late 1950s, and that can now be explained in terms of this discord between the Mahalanobis-led Kolkata ISI and the ICAR of Sukhatme’s time. India continued to follow different methods of statistical survey of food crops; thus, the scientific and political establishment always struggled with the apprehension that it did not have the ‘right’ data to come up with the correct assessment of the scene.
Keywords
At independence in 1947, India’s experience with the science of random sample surveys was rather limited (Didier, 2020, pp. 6–7). Spearheading such surveys in India was a small, yet highly proficient, group of Indian statisticians that included Prasanta Chandra Mahalanobis (1893-1972) of the Indian Statistical Institute (ISI) in Kolkata and Pandurang Vasudeo Sukhatme (1911-1997) of the Indian Council of Agricultural Research (ICAR). They played pioneering roles in organizing statistical surveys—putting together the manpower, laying down the plans for the survey, and coordinating with various departments of the Central and the state governments. The careers of these two statisticians overlapped as the independent Indian government worked towards building up its data repository on a range of topics from food-crop production to the birth rate of its population to facilitate intelligent planning (Menon, 2022). However, instead of collaborating, when the time came to assess the yield of food crops in the country, they entered into a long-drawn debate over a specific technical aspect of random sampling: the shape and size of cuts for field surveys. Mahalanobis recommended circular cuts of a radius of 4 feet for yield surveys, which Bengal and Bihar followed. Against this, Sukhatme vociferously argued for rectangular cuts of 33 feet in length and 16.5 feet in width. This debate over the cut size would continue across the colonial divide and into the late 1960s, embroiling statisticians in India and the Western world. This article looks into the debate over the cut size of survey plots to understand the historical contexts of the different statistical approaches used in surveying rural India as the country transitioned from a colonial economy to a planned economy. 
Through a historical analysis of the debate, the article also sheds light on how Sukhatme and Mahalanobis, as well as the respective organizations of which they were part, sought key roles in the production of statistical knowledge to understand India’s food situation.
A historical study of the debate over the cut size of survey plots allows us to trace the evolution of statistical surveys in independent India and gain an understanding of how competing statisticians aspired to make themselves, their institutions, and the data they collected relevant to the developmental planning of the country. If the planners used data collected by one institution, that gave statisticians of that institution considerable control over the development regime; the regime used statistical data in interpreting current conditions and based its development plans on it. 1 On the surface, the debate between Mahalanobis and Sukhatme was about technical differences, but beneath it lay a competition over who would exercise control over planning through collecting, collating, and interpreting data for the planners. The debate, therefore, reflects struggles among members of the scientific communities for government patronage within the evolving matrix of post-colonial state-building. 2 This historical analysis of a debate within the statistical community helps us shed light not only on the dynamic growth of the discipline and cognitive differences among its members, but also on how institutions, their members, and protagonists competed to play an instrumental role ‘in defining the techniques by which the economy was made intelligible to the state’ (Menon, 2022, p. 6). Post-independence India witnessed a simultaneous rise in planning and national statistics; statistical surveys turned out to be key to the planning process, and it became imperative for a statistician’s data to be counted as relevant by the state for its developmental needs. 3 As Mahalanobis and Sukhatme disagreed over ways to conduct statistical surveys, and with their respective institutions conducting separate surveys, it became a matter of huge debate as to whose data would be considered reliable by the Indian state.
Both Mahalanobis and Sukhatme mobilized their international networks to support their respective claims. They employed rhetorics of accuracy, efficiency, and economical means to convince members of political and technical domains of their side of the argument. Accuracy, efficiency, and economy became the key terms for one group to claim legitimacy over the other. Considering the professional statures of these two premier statisticians and how each had Western scientists backing their respective positions, the Indian state did not take sides or even force a reconciliation between the two debating sides. Rather, because of their technical skills, networking capabilities, and institution-building experiences, the experts made themselves sufficiently valuable to the state to force it to accommodate differences by separate allocations of areas of the survey between the two statisticians. As explained below, the regions of India that had a history of revenue collection through the zamindari system went with Mahalanobis’s method of crop-cutting, whereas the areas using the ryotwari system adopted the ICAR methods championed by Sukhatme. This piece of history, therefore, brings to light how the development regime, to serve its needs, did not always impose or dictate but followed a strategy of negotiation that helped maintain a balance of power among experts, its instruments of development, without alienating or losing control over them.
So far, only members of the statistical community have taken an interest in the technical details of the debate among these statisticians. 4 However, the historical analysis in this article reveals that issues involved in the debate go beyond any cut-and-dried take on technical accuracy that could interest only a trained statistician. The debate, seen through the lens of the social history of statistics, seems to be less about who had the correct statistical method to achieve mathematical accuracy and, rather, more about the historical and material contexts and the professional and political network within which the statisticians operated. 5 These factors collectively shaped their preference for statistical techniques and informed their ideas of accuracy, efficiency, and economical means. This article sheds light on the yield surveys’ connections with the political-economic world of the colonial era, inherited by the newly independent country that was yet to replace the colonial revenue system. To understand why the differences emerged in the first place, the article goes back to the first quarter of the twentieth century, the operations of the colonial revenue system, and the work of colonial surveyors, weaving these into a discussion of the post-colonial context defined by large-scale centralized planning and an urgent need for data. At the same time, the Indian state and its statistical community remained constrained by the paucity of trained manpower and financial resources to undertake the surveys efficiently. These material factors shaped the respective methods of statisticians and their understanding of the concepts of accuracy, efficiency, and economy.
This article looks at statistical actors, processes, and networks and sees these to be ‘crisscrossed by relations of power’ (Ramos Pinto & Paidipaty, 2020, p. 418). The science of statistical survey as it was practised in India reflects its imbrication with the political system and economic institutions of the geographical space within which it operated. More generally, statistics serves the political needs of the state, particularly in helping the state know what kind of country it is ruling and how it seeks to change it (A. Ghosh, 2020, pp. 9–13). Numbers offer a tool of persuasion and a basis for rational, methodical, calibrated, and repeatable action that remains unmatched (Porter, 1995). As a tool in the hands of the state, aiding policymaking, governance and comprehension, statistics has been very powerful. Hacking (1991, p. 181) insightfully comments that the collection of statistics not only provides information but is itself part of the ‘technology of power’ in a modern state. The states of the early nineteenth century and of contemporary times have coveted this ‘powerful’ tool (A. Ghosh, 2020, p. 29). For a statistical institute or agency to be in a position where it could become the sole provider of this powerful tool to the state was very empowering in itself. This article contextualizes the debate between Mahalanobis and Sukhatme and the subsequent division of survey territories between the National Sample Survey (NSS) and state agencies within the larger question of the exercise of power. While the materiality of the survey shaped their initial approach, their differences were sustained by the control they wanted to exercise over the supply of data to the state.
Statistical surveys in colonial India: Random sampling replaces guesswork
When Orissa became a separate province in 1936, Sir John Austen Hubback, a British civilian and a distant relative of the famous British author Jane Austen, was appointed its first governor. Hubback had served in Orissa in different capacities for 15 years before being appointed the governor of the province. He was a much-respected man among the elites of the region, having played an inspiring role in patronizing the Oriya language and literature. 6 The other activity that brought him much respect from contemporary statisticians was his work on crop surveys. In 1923–1925, Hubback applied random sampling and crop-cutting methods to undertake statistical surveys of rice yield in Bihar and Orissa. In his report, ‘Sampling for rice yield in Bihar and Orissa (1923–25)’, Hubback (1946, p. 283) commented at length on ‘the unreliable character’ of existing official methods of estimating crop yields, which amounted to making a guess. 7 With a healthy dose of sarcasm, Hubback (1946, p. 283) wrote: ‘The method is comparable to estimating the average income of the population of a town by watching the streets for a few days and then picking out a man who looked to be in average circumstances and discovering what his income is.’ It is important to remember that Hubback’s words were not the condescension of a Westerner towards supposed oriental non-performance; his criticism was directed against ‘the official method’, which had been very much the doing of the British officials associated with the survey work of the past.
Hubback’s crop-cutting experiment in Orissa has been widely recognized as the earliest in the world. Information on his survey work spread across the world, heralding a dynamic age of statistical techniques (Mahalanobis, 1946a, pp. 269–280). Hubback’s work found an application in the heartland of India in the hands of fellow Indian civil servants Chintamani Deshmukh and P.S. Rau. 8 Deshmukh, who topped the civil service exam and became the first Indian Governor of the Reserve Bank of India in 1943, was appointed the Deputy Commissioner of Raipur at the start of his career, where he helped to prepare the Forecast Report of Resettlement in 1926. In his memoir, Deshmukh wrote about introducing Hubback’s method in Raipur. Thousands of samples, which were generated over four successive seasons, were analysed to deduce the average yield for each soil class. The soil classes, in turn, were prepared by Deshmukh in association with D.V. Bal, who was the Soil Chemist to the Government of Nagpur at that point. The comprehensive report that Deshmukh and Rau prepared on average crop yield was part of the 1931 settlement report of the region. In the history of the development of random sampling in India, ‘this work was a landmark’, Deshmukh reflected a few years later (Deshmukh, 1974, p. 76).
Around the same time, P.C. Mahalanobis, who was then teaching at the renowned Presidency College of Calcutta, laid the foundation of the ISI on the outskirts of Calcutta. The Institute would train a generation of statisticians of world stature who, through wide use of the sampling method in social and economic research, would make India one of the ‘foremost users’ of the same (Rudra, 1998, pp. 142, 165). Soon after the establishment of the ISI, Mahalanobis was approached by the IARI to come up with a reliable estimate of the area under different crops in Bengal. In response, Mahalanobis (1940, p. 512) wrote: ‘I suggested exploring the possibilities of a random sample method for estimating the area under different crops in Bengal.’
Consequently, in 1937, Mahalanobis started looking for a ‘satisfactory’ estimate of the yield and acreage of jute crops (Rudra, 1998, p. 141; J. K. Ghosh et al., 1999, pp. 13–34, 22). When the report was published in 1946, it gave the figure of 7540 bales for jute production in Bengal. This was considerably more than the figure of 6304 bales arrived at by the colonial government. Mahalanobis’s number was very close to one that came from the customs and trade department, which gave a figure of 7562 bales, independently taking into account the whole produce (J. K. Ghosh et al., 1999, p. 23). Mahalanobis’s success stemmed not only from the correct calculation of jute production but also from its cost-effectiveness: the plot-to-plot enumeration undertaken by the government was ten times more costly and required fiftyfold more manpower. Despite pulling in huge resources, the official figure proved to be wrong. Mahalanobis, perhaps, became the first person to fully recognize the implication of joint consideration of sampling and non-sampling errors in statistical surveys, techniques which would be consistently applied in other large-scale surveys undertaken in independent India, particularly when the National Sample Survey would be instituted in 1950 (Rudra, 1998, p. 142).
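The closeness of the three jute figures quoted above can be checked with simple arithmetic. The sketch below is only illustrative: it treats the customs and trade figure as the benchmark, which is an assumption made here for the comparison, and the function and variable names are hypothetical.

```python
# Jute production figures quoted in the text (bales, as given in the source).
sample_survey = 7540   # Mahalanobis's sample-survey estimate
official = 6304        # colonial government's plot-to-plot enumeration
customs_trade = 7562   # customs and trade department figure

def relative_gap(estimate, benchmark):
    """Signed gap of an estimate from a benchmark, as a percentage of the benchmark."""
    return 100.0 * (estimate - benchmark) / benchmark

# Assumption: the customs and trade figure serves as the benchmark.
survey_gap = relative_gap(sample_survey, customs_trade)
official_gap = relative_gap(official, customs_trade)

print(f"sample survey vs customs figure: {survey_gap:.2f}%")
print(f"official figure vs customs figure: {official_gap:.2f}%")
```

On this reading, the sample survey falls within about a third of one percent of the customs figure, while the official enumeration falls short of it by roughly a sixth.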
Mahalanobis was deeply inspired by the statistical works of Hubback. He was untiring in his appreciation of the latter’s work. Writing on sample surveys in the whirlwind days of 1946, when the departure of the British had already become imminent, and Hubback had already left for England, Mahalanobis tried to give the contemporary statistical community an idea of Hubback’s achievement. A civil servant, working in the far-off hinterland of colonial Orissa, Hubback realized the significance of the random sampling method when the concept was neglected by as august a body as the Bureau of the International Institute of Statistics, which had been functioning from Europe since 1885. 9 Hubback not only successfully took a divergent path but inspired a whole generation of Indian statisticians to build upon his methodologies, and, most importantly, bring new insights. Outside India, recognition of Hubback’s work came from acclaimed statistician Ronald Fisher, who at that time, was analyzing immense data collected since the 1840s from crop experiments done at Rothamsted Experimental Station (Royal Society, n.d.). In a memorandum sent to the ICAR, Fisher pointed out how Hubback’s work ‘influenced greatly the development of (his) methods at Rothamsted’ (Mahalanobis, 1946a, p. 269). Both Hubback and Fisher had very little patience for earlier ways of selecting typical or representative fields in crop-cutting work. However, in the last leg of colonial rule, the method’s use remained limited despite its accuracy, because of a lack of properly trained staff to undertake similar surveys across the country.
Food crops and statistics: Sukhatme leads ICAR in sample surveys
It was only with the Bengal Famine of 1943 that the colonial government realized the acute urgency of using the random sampling method for proper assessment of food production. The Inter-Departmental Committee, called by Viceroy Lord Wavell in 1943, instructed the ICAR to prepare a suitable technique of random sampling that could be used for conducting annual country-wide surveys for estimating the yield of major food crops. ICAR had a small statistical unit that had been established in 1930 at the recommendation of the Royal Commission on Agriculture (Government of India, 1928, p. 77). Demands for such a unit also came from agricultural scientists, such as Canadian pathologist Leslie Coleman, who worked as the first director of agriculture in Mysore State in southern India. Like many other agricultural experts of his time, Coleman was appalled at the dismal shape of statistical research in India. In his report, a frustrated Coleman (1929, p. 116) wrote: ‘I have yet to see a Season and Crop Report in which one who had even an elementary knowledge of local agricultural conditions could not find either glaring errors or at least figures so doubtful as to demand tracing back to their sources.’ As a solution, he called for the immediate establishment of a statistical section in the agricultural department ‘which can study critically the figures collected’. The primary responsibility of the statistical unit was to assist the State Departments of Agriculture and Animal Husbandry in planning their experiments, analyzing experimental data, and interpreting results, along with advising ICAR on technical programmes. A decade after its establishment, the leadership of the unit went to statistician Pandurang Vasudeo Sukhatme. A student of Fisher, E.S. Pearson, and J. Neyman, Sukhatme played a leading role in developing statistical techniques that were relevant to different branches of agriculture.
Sukhatme’s work on agricultural sampling was marked by the pioneering applications he made of the concepts of stratified sampling, the use of pilot studies and double sampling as well as the study of non-sampling errors, all of which he brought to bear on the 1943 random sample surveys (Som, 1998, p. 257).
The Council’s survey scheme was initiated with experiments on wheat in Punjab and Uttar Pradesh and was gradually extended to cover almost the entire wheat and paddy acreage in the states of Madhya Pradesh, Bihar, Orissa, Bombay, Madras, and Assam, as well as in the centrally administered areas of Delhi, Ajmer-Merwara, and Coorg. Each member of the staff in the Council’s survey team was supplied with a uniform set of equipment for his experimental work. This consisted of a tape, a string, pegs, and a cross-staff for marking the plot, gunny cloth for threshing the produce, scales with a set of weights for weighing it, and bags for storing it. The staff member located and marked, as per given instructions, a plot of a given size, usually one-eighth of an acre, in a field growing either wheat or paddy. The weight of grain obtained from the designated plot constituted a single observation in the sample. The weight of the grain was determined through a sequential process of harvesting, threshing, and winnowing the produce. Since the grain on the harvesting day contained moisture, making it heavier than the actual weight, it was stored in gunny bags and reweighed after drying. The process followed by officials to arrive at the accurate production figure was very similar to that followed by the cultivator himself, except that, after winnowing, the latter might often lay out the grain in an open field, leading to serious crop loss (Indian Council of Agricultural Research [ICAR], 1951, pp. 12–14). Commenting on the procedure of sample surveys of food crops, Sukhatme pointed out that ‘stratification’ should be the ‘first aspect’ requiring consideration (ICAR, 1951, p. 18). Mahalanobis and Sukhatme both preferred stratified multi-stage sampling because it had been, by far, the most common device for ensuring a high degree of precision in estimating the production of any crop.
As the rate of yield was known to differ widely from place to place, Sukhatme’s suggestion was to adopt the administrative unit of tehsil (or talukas) as the strata to start with. From the tehsil, two to six villages were identified for survey purposes, followed by the designation of two or three fields per village; eventually, one plot was taken for crop measurement from each of these fields. Each of these plots was considered as the representative sample for each stratum.
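The nested selection described above (tehsil as stratum, then villages, then fields, then one plot per field) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a reconstruction of the ICAR procedure: the sampling frame is invented, the selection at each stage is assumed to be simple random sampling without replacement, and the names `draw_sample`, `villages_per_tehsil`, and `fields_per_village` are hypothetical.

```python
import random

def draw_sample(frame, rng, villages_per_tehsil=3, fields_per_village=2):
    """Stratified multi-stage selection: every tehsil (stratum) is covered,
    villages are drawn within each tehsil, fields within each village,
    and each selected field contributes one plot to the sample."""
    sample = []
    for tehsil in sorted(frame):                     # strata: all are included
        villages = frame[tehsil]
        n_villages = min(villages_per_tehsil, len(villages))
        for village in rng.sample(sorted(villages), n_villages):   # stage 1
            fields = villages[village]
            n_fields = min(fields_per_village, len(fields))
            for field in rng.sample(fields, n_fields):             # stage 2
                sample.append((tehsil, village, field))  # one plot per field
    return sample

# Invented frame for illustration: 3 tehsils, 8 villages each, 10 fields each.
frame = {
    f"tehsil_{t}": {
        f"village_{t}_{v}": [f"field_{t}_{v}_{f}" for f in range(10)]
        for v in range(8)
    }
    for t in range(3)
}

rng = random.Random(0)          # fixed seed so the draw is repeatable
sample = draw_sample(frame, rng)
print(len(sample), "plots selected")
```

With three villages per tehsil and two fields per village, both within the ranges given in the text, the invented frame yields eighteen plots, six from each tehsil.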
In the Survey Report submitted to the government, Sukhatme drew out a detailed comparative analysis of how his statistical methods in surveys conducted in 1944–49 yielded food crop acreage and production data that were substantially different from the existing official data. A comparison between the Survey and the official estimates of production done in states over the period 1944–45 to 1948–49 showed some dramatic differences. In Assam, for instance, the official overestimation of rice production ranged between 174.6 and 143.7 percent. States such as Bihar, Bombay, Madhya Pradesh, and Orissa, on the other hand, suffered from severe underestimation in the official estimates but not in the Survey. The available data showed that in 1944–45, the state showed 22.2 percent less production than the Survey, in 1945–46, it was 14.4 percent, in 1946–47, 8 percent less, and finally, in 1948–49 the difference was 9.5 percent.
While the ICAR’s survey team under Sukhatme helped to revise the production data, wheat continued to pose a challenge for the survey teams. This happened, the experts observed, because, unlike in paddy, annual fluctuations in yield for individual states were relatively large (due to diseases and pest attacks), making it a challenge to arrive at an average figure. In some cases, Council experts identified that the influence of seasons played a determining role in agricultural production. Often the survey teams found it difficult to carry on the estimation work, due to the practice of mixed cropping; it was common for the farmers to sow wheat with barley and gram. As each of these crops had different germination times, the surveyors could not practise the crop-cutting method efficiently.
Despite all the challenges of conducting the survey, Sukhatme was successful in implementing a stratified multi-stage sampling method to approximate the amount of food crop India was producing. The surveys conducted over a large swath of British India were a huge affair, and their successful conclusion gave a strong boost to Sukhatme’s career, bringing him recognition and accolades. Sukhatme’s work in the 1940s put ICAR in the limelight as well. Under Sukhatme’s leadership, the Council was successful in creating awareness among agricultural scientists about the role of statistics in their research (S. P. Sukhatme, 1998, pp. 389–390). So by the time of India’s independence, the country had two academic bodies, ISI and ICAR, which were developing statistical techniques relevant to Indian agricultural conditions.
Small circle or large rectangle: The great debate among statisticians
In the years leading up to independence until the end of the Second Five-Year Plan (FYP) in 1961, agricultural production and acreage surveys in India mostly followed either the trajectory led by Sukhatme or the one led by Mahalanobis. The two trajectories involved two schools of thought that differed over the best method of crop-cutting experiments for the estimation of crop yields. Mahalanobis recommended the use of circular cuts of radius 4 feet for yield surveys, which Bengal and Bihar followed. As against this, experts at the Council had been using the rectangular cuts of 33 by 16.5 feet.
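The difference in scale between the two cuts can be made concrete with a little arithmetic. The sketch below is illustrative and not drawn from either statistician’s papers; it assumes the standard conversion of 43,560 square feet to the acre, and the variable names are hypothetical.

```python
import math

SQ_FT_PER_ACRE = 43_560          # standard conversion, assumed here

# Circular cut of radius 4 feet (the ISI cut described in the text).
circle_area = math.pi * 4 ** 2

# Rectangular cut of 33 by 16.5 feet (the ICAR cut described in the text).
rect_area = 33 * 16.5

area_ratio = rect_area / circle_area
rect_plots_per_acre = SQ_FT_PER_ACRE / rect_area
circle_plots_per_acre = SQ_FT_PER_ACRE / circle_area

print(f"circular cut:    {circle_area:.2f} sq. ft")
print(f"rectangular cut: {rect_area:.1f} sq. ft")
print(f"rectangle is {area_ratio:.1f} times the circle in area")
print(f"rectangular cuts per acre: {rect_plots_per_acre:.0f}")
print(f"circular cuts per acre:    {circle_plots_per_acre:.0f}")
```

On these figures the rectangular cut covers a little under eleven times the area of the circular one.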
For Mahalanobis and Sukhatme, the size of the plots became a source of serious differences, which continued to simmer from the latter half of the 1940s through the publication of the First FYP. Each side tried to prove the shortcomings in the other’s method while validating their preferred method regarding expenses, efficiency, and precision. Using statistical data, they challenged each other’s work, and the exchanges spilled onto the pages of Nature, internationalizing the debate. On January 29, 1946, in a letter to Nature, Sukhatme criticized Mahalanobis’s decision to use plots of less than 13.6 sq. ft area to conduct crop-cutting experiments. Sukhatme was bewildered as to why Mahalanobis would decide on a small size of plots, as the data already collected from surveys carried out in the Moradabad district of United Provinces unequivocally demonstrated that there was ‘serious over-estimation of yield’ in fields measuring 30 sq. ft or less. In the letter, Sukhatme mentioned that data collected from eight different-sized plots revealed that while the ‘bias’ diminished with an increase in a plot’s size, even plots as big as 118 sq. ft continued to show more crops than the amount farmers produced. So, for ICAR statisticians, the Moradabad survey, which was carried out over 2,288 square miles, demonstrated that ‘small size plots most probably lead (sic) to biased results’ (P. V. Sukhatme, 1946, pp. 157, 630). The publication of Sukhatme’s letter in Nature elicited a sharp response from Mahalanobis, who pointed out that there was nothing new in the former’s observation. Mahalanobis retorted that the issues of over-estimation with small plot sizes had already been fixed in 1940, after being reported by the statisticians of ISI themselves, and that since then a ‘good deal of work’ on the subject had been done in ISI (Mahalanobis, 1946b, pp. 798–799).
Apart from the two-page rejoinder sent to Nature that came out in the fall of 1946, Mahalanobis published a dozen-page article in Sankhya, the renowned statistical journal of ISI, outlining the history, rationale, challenges, and, most importantly, the significance of the sample survey methods he followed in Bengal to assess the crop yields of the province. In both pieces, Mahalanobis brushed aside Sukhatme’s explanation that a ‘bias’ in small-size plots occurred because of the ‘human tendency’ to include plants at the border of the field within the plot itself, particularly so when the perimeter of the plot was large in proportion to its area. His counterargument to Sukhatme’s assertion on human tendency was not to deny that such a tendency existed but to show that in the case of circular cuts, which he had been employing, there could be both over- and underestimation. Therefore, on the whole, Mahalanobis wrote that when the work is done by a ‘large’ number of different investigators, ‘these tend to cancel out so that results obtained with the circular cut … appear to be without bias’ (Mahalanobis, 1946a, p. 274).
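Two pieces of reasoning in this exchange lend themselves to a small numerical sketch: the perimeter-to-area argument, and the claim that errors running in both directions cancel across many investigators. The block below is only an illustration of the logic, not a calculation made by either statistician. The error ranges, the number of investigators, and all names are assumptions, and the simulated errors are invented rather than taken from any survey.

```python
import math
import random

def perimeter_per_unit_area(perimeter_ft, area_sq_ft):
    """Feet of plot boundary for every square foot of plot area."""
    return perimeter_ft / area_sq_ft

# Circular cut of radius 4 ft versus rectangular cut of 33 by 16.5 ft.
circle_ratio = perimeter_per_unit_area(2 * math.pi * 4, math.pi * 4 ** 2)
rect_ratio = perimeter_per_unit_area(2 * (33 + 16.5), 33 * 16.5)

def mean_measured_yield(true_yield, low, high, investigators, rng):
    """Average yield reported by many investigators, each of whom mis-measures
    by a proportional error drawn uniformly from [low, high] (hypothetical)."""
    total = 0.0
    for _ in range(investigators):
        total += true_yield * (1 + rng.uniform(low, high))
    return total / investigators

rng = random.Random(1946)   # fixed seed for a repeatable illustration
# Errors in both directions (the pattern Mahalanobis described): assumed +/-10%.
two_sided = mean_measured_yield(1.0, -0.10, 0.10, 10_000, rng)
# Errors in one direction only (the pattern Sukhatme described): assumed 0 to +10%.
one_sided = mean_measured_yield(1.0, 0.0, 0.10, 10_000, rng)

print(f"boundary per sq. ft, circle:    {circle_ratio:.3f}")
print(f"boundary per sq. ft, rectangle: {rect_ratio:.3f}")
print(f"mean with two-sided errors: {two_sided:.3f}")
print(f"mean with one-sided errors: {one_sided:.3f}")
```

Under these assumptions the circular cut has about 0.5 feet of boundary per square foot against roughly 0.18 for the rectangle, so any border effect would weigh more heavily on the smaller cut. Averaging over many investigators removes two-sided errors but leaves a one-sided tendency intact; which of the two patterns actually held in the field was the point in dispute.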
Mahalanobis drew on precedents where the small-size plot estimation method had been applied successfully. He also borrowed from Hubback’s observations in the matter to argue his case. In his Sankhya article, the ISI founder repeatedly quoted Hubback, such as on how he saw ‘no advantage’ in taking a large number of samples from places very close together, as the crops would naturally be very much the same on the same day and, again, ‘the degree of accuracy is not seriously improved by taking large samples’; he quoted Hubback’s advice that ‘there is no need to take large samples’, and that instead ‘handy samples’ obtained by random sampling would be equally helpful (Mahalanobis, 1946a, p. 270). The statistical work of Deshmukh and P.S. Rau also helped to bolster Mahalanobis’s confidence in the precision of the small-size plot estimation method. To get a triangular cut in an area of only 1/3200 of an acre, Hubback had used a special apparatus. A decade later, Mahalanobis used the same idea for his circular cut. The instrument was designed by Mr J.M. Sengupta, a former Physics student of Mahalanobis from the Presidency College, who after independence became part of the National Sample Survey Team under his former teacher.
The quality of Mahalanobis’s work would be recognized by none other than R.A. Fisher, with whom he would develop a life-long friendship. Reminiscing on his friendship with Fisher, Mahalanobis wrote about the latter’s profound intellectual influence on him.
I believe I can claim to be the first convert to the Fisherian view of statistics; I have also tried to extend his ideas to the design of sample surveys. For me, the discovery of Fisher, nearly forty years ago, was an important factor in deepening my interest in statistics which was further strengthened by the impressions of the memorable day I spent with him at Rothamsted Agricultural Station in 1926 when I met him for the first time. (Mahalanobis, 1964, p. 370)
From 1937 to 1962, Fisher visited India eight times, always staying with Mahalanobis’s family and closely interacting with the ISI students on theories and practical applications of statistics. 10 ‘The special needs of an underdeveloped country like India’, observed Mahalanobis, ‘had made it continually necessary for us to increase the scope of application of statistical methods in widely differing subject fields in natural and social sciences, technology and economic planning’; in such developments, Mahalanobis repeatedly acknowledged receiving the support of Fisher (Mahalanobis, 1964, p. 370).
Mahalanobis’s close associates had also put on record how the British statistician, in the 1950s, provided him with crucial support in defending the ISI approach to statistics, as it had been suffering ‘bad weather’ in Government circles, particularly with the ICAR (Rudra, 1998, p. 298). Mahalanobis’s deep connections with Fisher and C.D. Deshmukh would help ISI to grow as the hub of statistical research in India and place it at the heart of development planning; Mahalanobis gratefully remembered that it was Fisher who first formulated in a precise way the concept of the Indian Statistical Institute as a higher technological institution comparable, although on a much smaller scale, to the Zurich Federal School of Technology or the Massachusetts Institute of Technology. Mahalanobis felt that ‘Ronald Fisher had exercised more influence than anyone else in the shaping of the policy and programme of the Indian Statistical Institute’ (Mahalanobis, 1964, p. 370).
Of particular help to ISI and Mahalanobis was having the President of ISI, C.D. Deshmukh, as the first Finance Minister of independent India during the crucial First Plan period (Menon, 2022, pp. 43–57). 11 When the institute was rocked by a series of financial and administrative crises, it was Finance Minister Deshmukh who helped ISI maintain its stability (Indian Statistical Institute, Kolkata, n.d.). Deshmukh and Fisher both shared with Mahalanobis the vision that the institute set out to realize: to emphasize the ISI’s underlying unity of purpose, namely the promotion of research and knowledge, Deshmukh had drawn the ‘Banyan Tree’, the emblem of the Institute that was adopted just before the first convocation in 1962. Incidentally, it was Fisher who would refer to ISI as the ‘Banyan Tree’ (Indian Statistical Institute, n.d.).
Accuracy, economy and efficiency: Revenue structures and statistical surveys
Though the textual evidence of the debate over the size of sample plots revolves primarily around Sukhatme and Mahalanobis and their institutions, ICAR and ISI, the root of the issue was enmeshed in a wider social and material context, particularly the British land tenure system. Prasanta Mahalanobis’s and Pandurang Sukhatme’s respective preferences for small and large cut plots had to do with the kind of agricultural revenue structure within which they operated.
In the areas of Bengal, Bihar, Orissa, and Central Provinces, where Hubback, Deshmukh, Rao, and Mahalanobis had been conducting surveys, the British had introduced the ‘Permanent Settlement’ system (1793), fixing the landlords’ (zamindars) revenue commitments to the government in perpetuity. In this system, the colonial state first assumed to itself and then transferred to zamindars the proprietary right to the soil (Bose, 1993, p. 70). 12 Nationalists, commenting on the colonial agrarian condition, saw in the permanent settlement a thorough disempowerment of the cultivators, who had no right to hold land against the will of the government (Chandra, 2016, p. 199). 13 The anti-colonial approach, however, ignored the possibility that the domination of landlords was accompanied by simultaneous attrition of state capacity (Baden-Powell, 1892, p. 402; Lee, 2019, p. 413). 14 In settling the revenue demand permanently with the landlords, the administration became dependent on them for not only collecting taxes but for all data on crop production, acreage, yield, and crop loss, as well as for improvement measures (Mitra, 1898, p. 91; Ray, 1979, pp. 6–7; Sinha, 1962, pp. 18, 27). 15 As revenue demands could not be increased in these areas for any growth in crop production, such data was of little use to the government, making administrators indifferent to its significance. The zamindars also found it ‘distasteful’ to have details of authorized rents and rights of cultivators included in village records. The village patwaris were instructed to maintain only such accounts as the zamindar wanted for his own benefit. 16 Such was the extent of subservience of the patwaris in the permanently settled areas that Baden-Powell referred to them as the ‘bond-slaves’ of the zamindars (Baden-Powell, 1892, p. 284). 
Thus, when Mahalanobis had to undertake surveys in villages belonging to such areas, he had to manage with the help of personnel who were not employees of the state but recruited from outside. The situation posed challenges in terms of manpower availability and financial resources. The history of surveys carried out under Mahalanobis indicated that he tried to meet this fundamental challenge by devising statistical methods appropriate for working within the constraints without compromising accuracy. While the Bengal Famine Commission (1943) recommended the establishment of an agency similar to the Patwari System of the Ryotwari area for carrying on the survey work, Mahalanobis employed trained investigators to work full-time under the direction of ISI, Kolkata (Mahalanobis & Sengupta, 1951, p. 360).
Unlike his post-independence articles, which dealt with intricate mathematical details of plot size, Mahalanobis’s earlier articles gave insights into the structural issues that played a role in the technical choices. For instance, Mahalanobis wrote in Sankhya that with small cuts of the size used by Hubback, Deshmukh, and Rau, as well as by him, the number of sample cuts could be increased while reducing the number of villages required to be surveyed. Deshmukh’s work over 1928–30 showed that by bringing down the number of villages from 434 to 184 and by increasing the cuts from one to five, the survey became much more economical without compromising the precision of the results (Mahalanobis, 1946a, p. 273). The economy was achieved by cutting down on the costs incurred in traveling from one village to another.
In the past, the observers in the permanently settled areas had to travel to a specified village and a specified field. Only then could they select a point on the field, demarcate a sample cut of the required size, follow it by harvesting plants and processing the crops, and eventually weigh the yield on the spot. Long distances, impassable roads, and unavailability of transport were integral parts of the survey work. ‘I toured intensively through five seasons, in all covering about five to six hundred villages each year’, writes Deshmukh in his memoir, recollecting his settlement days as the Settlement Commissioner of the Raipur District. The rice fields were muddy in November-December, ‘dry and cracked and cloddy’ in the hot weather, the roads had to be improvised, and traveling had to be done on foot in combination with car, horse, bicycle, and bullock tonga. Circumventing all these challenges, Deshmukh managed to usually inspect six to seven villages in the forenoon and three to four in the afternoon (Deshmukh, 1974, pp. 76–77).
Thus, for surveyors working in the permanently settled districts small-cut plots came as a boon, helping to reduce the number of villages to be surveyed and, thereby, the travel required of a small coterie of observers at their disposal. The long travel, diligence, and alacrity required in such survey works could only be made possible if the investigators could collect the large number of samples ‘without much trouble or fatigue’ (Mahalanobis, 1946a, p. 270). Hubback, from his personal experience of surveying the Santhal Pargana—the tribal territory in the current state of Jharkhand in eastern India—observed that when the fieldwork became ‘strenuous and tedious’ there was an increased probability of survey practices being ‘shirked and results fudged’ (Hubback, 1946, p. 287). Mahalanobis even recommended ‘stationary investigators’, who could carry on the work in the neighbourhood of their normal residence. In fact, he saw in it a ‘distinct gain in precision’, as each investigator, because of the small area they had to cover, would be able to collect at the same time a much larger number of sample cuts. Theoretically, at least, two to three thousand sample cuts properly randomized in both time and space should be able to supply a mean value of the yield per acre with a percentage error of the order of 1 percent, which would be amply sufficient for all practical purposes, Mahalanobis surmised (Mahalanobis, 1946a, pp. 274–275).
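The arithmetic behind Mahalanobis's estimate can be illustrated with a short sketch. It assumes simple random sampling with independent cuts, under which the percentage error of the mean is the coefficient of variation of individual cut yields divided by the square root of the number of cuts. The coefficient of variation of 50 percent used here is a hypothetical figure chosen for illustration; it is not taken from Mahalanobis's papers, and the actual surveys used more elaborate designs.

```python
import math


def percentage_error_of_mean(cv_percent: float, n_cuts: int) -> float:
    """Percentage standard error of the mean yield under simple random sampling.

    cv_percent is the coefficient of variation of individual cut yields
    (an assumed input, not a value reported in the sources).
    """
    if n_cuts <= 0:
        raise ValueError("n_cuts must be positive")
    return cv_percent / math.sqrt(n_cuts)


def cuts_needed(cv_percent: float, target_error_percent: float) -> int:
    """Smallest number of cuts that brings the percentage error down to the target."""
    if target_error_percent <= 0:
        raise ValueError("target_error_percent must be positive")
    return math.ceil((cv_percent / target_error_percent) ** 2)


if __name__ == "__main__":
    # Hypothetical coefficient of variation of 50 percent between cuts.
    for n in (500, 1000, 2500):
        print(n, round(percentage_error_of_mean(50.0, n), 2))
    print(cuts_needed(50.0, 1.0))
```

On this assumed variability, a percentage error of 1 percent calls for 2,500 cuts, which falls within the range of two to three thousand that Mahalanobis mentioned. A different coefficient of variation would shift the required number accordingly.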
In Bengal, the colonial administration determined the revenue amount without any area survey. In doing so, it neglected any numerical understanding of fertility, productive power, or acreage. The Ryotwari Settlement was far better positioned in these matters, so much so that Baden-Powell suggested that these areas might be more correctly described as a ‘Survey-Assessment’. 17 The terms of the Settlement were as true of Bombay as they were of Madras. Centered around the ryot or cultivator, the system made revenue settlement directly with the individual tilling the land. In these areas, an extensive cadastral survey of the land was done and a detailed record of rights was prepared, which served as the legal title to the land for the cultivator (Banerjee & Iyer, 2005, p. 1193). The colonial state operated at a much higher capacity in non-zamindari areas, with a better ability to extract money and access information on rural life.
Among the village officers who assisted the state in running the mechanism of operation in ryotwari or mahalwari areas, the institution of the patwari was of utmost importance. Serving as the village accountant and registrar, he was entrusted with the task of recording the area of land cultivated by each household, the nature of crops sown, and the amount of revenue to be collected. 18 These records were kept in patwari documents, such as Milan Khasra (comparative statement of cultivated, cultivable, and barren land over the years), Naqsha Jinswar (statement of crops based on each harvest), and Naqsha Bagat (statement of groves and orchards), to name a few. The maintenance of the village maps, showing the size and division of fields, and even changes in the length of roads, drains, and wells, was also one of the primary duties of patwaris (Husain, 2017, pp. 179–184).
The ICAR statisticians were aware of the challenges of surveying the areas that had none of the detailed recordings of land and produce that existed in Ryotwari land. In his report on ICAR surveys, Sukhatme wrote that the acreage statistics in the permanently settled areas were of ‘doubtful reliability’, since in these areas an ‘extensive agency’ for field-to-field enumeration of crop acreage along with provision for adequate supervision did not exist (ICAR, 1951, p. 195). He discussed the existing ‘elaborate’ revenue agency of patwaris reaching ‘the remotest villages’, and how such officials maintained ‘yearly statistics of crop acreage by field to field inspection’. Sukhatme’s confidence in the ‘elaborate Government organization extending to the remotest village’ was shared by two other government statisticians, V.G. Panse and R.J. Kalamkar, associated with the Institute of Plant Industry at Indore and the Department of Agriculture of the Central Provinces respectively (Panse & Kalamkar, 1944, p. 121).
Panse and Kalamkar started working together on the crop-cutting experiment they undertook on cotton in the Bombay Presidency around the mid-1940s. Eventually, the two statisticians would end up summarizing the result of their work in the pages of Current Science, the well-known journal of the Indian National Science Academy. Their observations further bolstered Sukhatme’s argument about the benefits of keeping the size of sample plots large and the need to work within the existing administrative system rather than hire trained personnel for surveys. In the two papers that Panse wrote with Kalamkar, the two statisticians argued that if an objective method of crop estimation was to be successfully introduced as a regular annual feature, the method should be developed in such a manner as would fit into the existing administrative machinery with the minimum essential changes in the official procedure. They therefore proposed to use personnel from the Land Records Department as field staff, while additional district staff were recruited from the revenue and agricultural departments and given training in the course of the fieldwork. For the entire Akola district in Maharashtra, where they were surveying cotton crops, they randomly selected 204 villages in all and decided that a single plot of one-tenth of an acre would be harvested per village (Panse & Kalamkar, 1944, p. 125). It is worth remembering here that the sample plot size that Hubback recommended was 1/3200 of an acre. Both Hubback and Mahalanobis found the cuts of one-tenth of an acre used in Government Agricultural Departments ‘entirely unnecessary’ (Mahalanobis, 1946a, p. 274).
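The plot sizes at stake can be put side by side with a little arithmetic. The sketch below is an illustration added for comparison and is not a calculation taken from the sources. It assumes the standard acre of 43,560 square feet and uses the circular cut of radius 4 feet and the rectangular cut of 33 by 16.5 feet described in the introduction, alongside the fractions of an acre quoted above. The labels in the dictionary are descriptive names chosen here.

```python
import math

SQ_FT_PER_ACRE = 43_560  # standard definition of the acre


def circle_area(radius_ft: float) -> float:
    """Area in square feet of a circular cut of the given radius."""
    return math.pi * radius_ft ** 2


# Areas in square feet of the cut sizes mentioned in the article.
cuts_sq_ft = {
    "Hubback cut (1/3200 acre)": SQ_FT_PER_ACRE / 3200,
    "ISI circular cut, radius 4 ft": circle_area(4.0),
    "ICAR rectangular cut, 33 x 16.5 ft": 33 * 16.5,
    "Panse-Kalamkar plot (1/10 acre)": SQ_FT_PER_ACRE / 10,
}

if __name__ == "__main__":
    for name, area in cuts_sq_ft.items():
        fraction = SQ_FT_PER_ACRE / area
        print(f"{name}: {area:.1f} sq ft, about 1/{fraction:.0f} of an acre")
```

On these figures the rectangular cut covers 544.5 square feet, exactly 1/80 of an acre and roughly eleven times the area of the 4-foot circular cut (about 50 square feet), while the one-tenth-acre plot is eight times larger again.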
Miscalculating yield or area? Confusion in crop count
Amidst the debate over the statistical plot sizes, agricultural statistics as a discipline consolidated itself in the decade following India’s independence: institutional growth was matched by the strengthening of international networks, and the country’s pioneering statisticians received global recognition. Growth of the discipline, however, did not imply that statistics was effectively used to understand India’s food crisis. This was a serious drawback for which the statistical community and the Indian government had to face sharp criticism from all around. The existing data was insufficient and unreliable: ‘on the basis of available data’, experts pointed out in the First FYP, ‘it is not possible to reach any definite conclusions’ on the magnitude of food deficit. In the Parliamentary Debates of March 1950, the Agricultural Minister was rebuked from all sides over the terrible condition of the data. Mukhtiar Singh Malik, the then General Secretary of the Punjab Zamindara Party, lashed out at the ‘hopeless condition’ of agricultural statistics. 19 Mukhtiar joined Dr. Pattabhi Sitaramayya of the ruling party and demanded a categorical settlement on the food question that would delineate the aggregate amount of production, outline the quantity that needed to be procured from farmers, and inform the country of the volume of food crops that had to be imported for that year. 20 Staunchly averse to importing food, many members of the Indian Parliament felt that the issue could be decisively settled if the Ministry of Food and Agriculture came up with reliable data on the matter. 21 In the absence of the requisite data on production, senior Members of Parliament sometimes refused to believe that there were actually shortages of food in the country. Dr. Pattabhi declared, ‘Your dearth of food is a theoretical idea, is a fantastic idea. It is not there. There is enough food in the country. Leave the people alone. Do not try to organize the lives of the people. Try to restore normal conditions.’ 22
For politicians like Pattabhi, having the agricultural sector in order was very important, as farming could provide employment for millions of Indians who could not otherwise find jobs in industries (Pattabhiram, 1984, p. 121). The Government of India, therefore, needed accurate data on food production not only to understand the ‘deficit’ but also to convince others of the need for a methodical, coordinated, rational and calibrated approach to the problem of food shortages, as well as to bring about planned agricultural development.
To facilitate data collection, the Indian government approved the National Sample Survey (NSS) in January 1950. It was to conduct continuous biannual urban and rural surveys that would shed light on the socio-economic, agricultural, and industrial trends of a population of over 360 million people. The NSS Committee comprised Mahalanobis, economist Dhananjay Gadgil of the Gokhale Institute of Politics and Economics in Pune, and bureaucrat P.C. Bhattacharya. 23 As far as the agricultural sector was concerned, the NSS estimates of crop production were based on land utilization surveys and crop-cutting experiments that had been part of the multi-subject socio-economic surveys under the technical guidance of the ISI. 24 The ICAR was previously responsible for providing technical advice and for supervising the fieldwork of crop-cutting surveys in all states, except Kerala, Orissa, and West Bengal. From 1953 onwards, these functions came to be discharged by the Agricultural Statistics Division (ASD) of the Directorate of NSS. Senior employees of ISI-Calcutta such as Debabrata Lahiri were in charge of sample design, along with Satyabrata Sen and Nimai Charan Ghosh for statistical work. The NSS series of area and production data were available for seven principal cereals (rice, wheat, jowar, bajra, maize, ragi and barley) and referred to as ‘NSS estimates’. The NSS estimates of the area under the crop were obtained by direct physical observation of the land utilization from sample clusters of plots in sample villages. The yield rates were obtained by crop-cutting experiments in a sub-sample of plots by harvesting the crop in circular cuts of a radius of 4 feet, just as the researchers had done in the earlier period.
The second source of agricultural production data was the State governments, which furnished these figures to the Ministry of Agriculture. With this data, the Ministry compiled the all-India crop estimates referred to as the ‘official estimates’. The official estimate of production of any of the seven major cereals was obtained as the product of the area under the crop and the yield rate. The area under the crop was ascertained in all states almost wholly by field-to-field enumeration by the state revenue agencies, according to procedures laid down by the Department of Revenue. 25 To calculate the yield rate for the official estimates, crop-cutting experiments were conducted by the random sampling technique. For survey purposes, all states (except Orissa and West Bengal) followed the method of crop-cutting experiments that was developed by the ICAR, in which the sample harvest was generally made on rectangular cuts of size 33 x 16 ½ feet. Orissa and West Bengal, on the other hand, followed the technique developed by ISI, using circular cuts of radii 4 to 5.6 feet respectively (Sen et al., 1967, pp. 1–2).
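The way a production figure follows from these two components can be sketched in a few lines. The example is a simplified illustration: it scales the mean harvest from sample cuts up to a per-acre yield rate and multiplies it by the crop area. The cut weights, the unit (pounds), and the acreage are hypothetical values chosen for the arithmetic, and the function names are introduced here; actual estimation procedures involved stratification and other adjustments not shown.

```python
SQ_FT_PER_ACRE = 43_560  # standard definition of the acre


def yield_per_acre(cut_weights, cut_area_sq_ft):
    """Scale the mean harvested weight per cut up to a per-acre yield rate."""
    if not cut_weights:
        raise ValueError("at least one cut is required")
    mean_weight = sum(cut_weights) / len(cut_weights)
    return mean_weight * SQ_FT_PER_ACRE / cut_area_sq_ft


def production_estimate(area_acres, yield_rate_per_acre):
    """Production as the product of area under the crop and the yield rate."""
    return area_acres * yield_rate_per_acre


if __name__ == "__main__":
    # Hypothetical harvests (in pounds) from three rectangular cuts of 33 x 16.5 ft.
    weights = [10.0, 12.0, 14.0]
    rate = yield_per_acre(weights, 33 * 16.5)
    print(rate)                               # pounds per acre
    print(production_estimate(1_000.0, rate))  # pounds from a hypothetical 1,000 acres
```

Because production is a simple product, an error in either the recorded area or the yield rate passes straight through to the final figure, which is why both components came under scrutiny when the two series diverged.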
Since the inception of the NSS series, however, the official and NSS estimates of production for the seven cereals exhibited ‘substantial’ differences from each other. The magnitude of this difference was much larger from 1957–58 to 1961–62 when it ranged between 15 and 24 million tons. The difference narrowed down to 5 to 8 million tons from 1962–63 to 1965–66 (Sen et al., 1967, pp. 2–3). 26 The large divergence between the two sets of estimates called for a critical examination of the two series and the possible sources which might have contributed to the difference.
Beginning in May 1959, the large difference between the two series of crop estimates came up for consideration in several technical meetings. In the wake of these discussions, the Central Statistics Office (CSO) set up a working group consisting of the representatives of the Ministry of Food and Agriculture, ISI, and NSS to examine the question further. It also consulted Dr. Frank Yates on the subject. Several possible factors were listed as being responsible for the high range of differences between the two estimates, among which the size of the cut used for crop-cutting experiments figured prominently. The Committee expressed the view that a special study under the joint technical supervision of the Ministry of Food and Agriculture, CSO, and ISI would be necessary to examine the influence of the size of cuts on the estimate of yield rate. It recommended the use of both types of cuts in the same set of fields as well as the harvesting of whole fields.
The wide margin of difference between the two series of crop estimates came up for discussion at a meeting of the Planning Commission in September 1960, presided over by Prime Minister Jawaharlal Nehru, which decided that a committee of senior officials should look into this matter. On the 9th of September, 1960, that committee decided that, first, the CSO should suggest a programme of technical studies which would help to identify the reasons for the discrepancy and, second, the NSS land use and crop-cutting surveys should continue on an enlarged scale for a period of three years to provide firm estimates at the all-India, State and regional levels.
The Technical Committee was chaired by S.R. Sen, the Chair of the Planning Commission, and represented the Department of Agriculture, the NSS, and ISI. It held eight meetings from January 1963 to August 1967. Over this period, the scheme of type-1 studies envisaged a comparison of yield rates based on circular cuts of radius 4 feet and rectangular cuts of size 33 by 16 ½ feet from the same set of fields. The first type-1 study was conducted on jowar in November-December 1960, in two villages near Bundi in the state of Rajasthan in western India, and the second type-1 study was carried out in 1961 on wheat and barley at Barh in Bihar. Technical details of both the studies were drawn up jointly by the ISI, NSS, and the State governments concerned in consultation with the CSO. In 1963-64, the type-1 studies were followed by the type-2 studies. The latter envisaged joint crop-cutting experiments by both techniques on a larger scale than in the type-1 studies and on all principal cereal crops. One major difference between the two studies was that the type-2 studies were to be organized under normal field conditions with the same scale of supervision. Unlike in the case of type-1 studies, these studies were carried out under the general guidance of the Technical Committee, according to the design, concepts, and definitions laid down by the Committee. The data were independently analyzed by the ISI, NSS, and Institute of Agricultural Research Statistics (IARS), which was established in 1959 and worked within the ambit of ICAR. 27
The Technical Committee considered the evidence of the studies ‘to be the best that could be obtained under the present circumstances’. According to the Bundi (type-1) study, the difference between the yield rates based on the circular and State cuts was negligible and was not statistically significant. In the Barh (type-1) study, the divergence between the cuts was found to be not statistically significant according to the analysis presented by the NSS and ISI. According to the IARS, the differences in yield rates between circular and rectangular cuts were significant at a five percent level. It is evident that the NSS and ISI had disagreements with IARS over the interpretation of data collected through the studies. There was common agreement, however, in the Barh study, that the NSS tended to overestimate the yield rate. Magnitudes of difference were, however, small in relation to the observed standard errors. The results of type-2 studies which were available for four different crops, one each from four states with diverse field conditions, did not bring out any marked difference in the yield rates with the two types of cuts. The agreement between the two estimates was quite close, and going by the evidence of these studies alone either cut was considered as good as the other in terms of bias. This conclusion was strengthened by the fact that the difference in yield rates was not always in the same direction, being positive in the case of two crops and negative in the case of the other two. The conclusion that the Technical Committee reached from the type-1 and type-2 studies was that the large divergence observed between the NSS and official series of production was not ascribable to the difference in the type of cut adopted in the respective series.
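One common way to test whether two kinds of cut taken from the same fields give different yield rates is a paired comparison, in which the difference between the two cuts is computed field by field. The sketch below shows that calculation as an illustration only. The sources summarized here do not state which tests the NSS, ISI, or IARS applied, so the paired t statistic is an assumed stand-in, and the yield figures are hypothetical.

```python
import math
import statistics


def paired_t_statistic(yields_a, yields_b):
    """Return (mean difference, t statistic) for yields measured on the same fields.

    yields_a and yields_b hold one yield rate per field for each type of cut,
    listed in the same field order.
    """
    if len(yields_a) != len(yields_b) or len(yields_a) < 2:
        raise ValueError("need two equally long series with at least two fields")
    diffs = [a - b for a, b in zip(yields_a, yields_b)]
    mean_diff = statistics.mean(diffs)
    std_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    if std_error == 0:
        raise ValueError("all differences are identical; t statistic is undefined")
    return mean_diff, mean_diff / std_error


if __name__ == "__main__":
    # Hypothetical yield rates from four fields, each harvested with both cuts.
    circular = [10.0, 12.0, 11.0, 13.0]
    rectangular = [9.0, 12.0, 10.0, 12.0]
    print(paired_t_statistic(circular, rectangular))
```

With these made-up figures the circular cut averages 0.75 units higher and the t statistic is 3.0 on three degrees of freedom, just short of the two-sided five percent critical value of about 3.18. The example also shows how analysts working with the same field data but different methods or thresholds could reach different verdicts on significance.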
Once the Technical Committee ruled out differences in the type of cut as the crucial factor causing divergence between the two series produced by NSS and the official estimates, it argued that the difference must be stemming from wide divergences visible between the trends of the two series of area estimates: In the official series, the practices followed for recording net areas happen to be different from state to state and sometimes even within a state. The Technical Committee felt that the procedures laid down in the Land Record Manuals for recording net areas were not carefully followed. Another factor influencing the official estimates of the area could be, the Committee pointed out, the cumulative effect of errors in recording, aggregating, and transfer of area statistics from the village level up to the state level. Once the Committee was confident that the divergence in crop production data was caused by errors in calculating and recording the crop area, it urged the following steps: ‘well-considered’ concepts and procedures for recording areas under mixed crop cultivation, adequate training of the primary staff, and supervision over their work.
Having pointed out the work that still needed to be done to root out the errors of the official estimate, the Technical Committee nevertheless stated categorically that all possible measures should be taken to improve the quality and timeliness of the official series of crop statistics, ‘as no other series could ever completely replace the official series’. Moving away from the great debate over the sample cut size, the Technical Committee instead emphasized accurate procedures of compilation and estimation, uniformity in concepts and definitions to ensure inter-state comparability, and ‘intensive supervisory’ checks. Thus, unlike in the past, the goals of accuracy and efficiency were no longer tied to the revenue structure within which the surveys operated; they were seen as realizable through maintaining basic procedural hygiene.
The observations of the Committee brought an end to a long debate between the two pioneer statisticians in India and the institutions of which they were a part. The biggest blow to NSS and Mahalanobis came with the Committee observing that if their recommendations were accepted and implemented, there would be ‘no need’ for continuing the NSS series of crop estimates. The Committee proposed that the NSS staff, who came with considerable experience in sample surveys, could be used for checking the quality of samples collected by state agencies. The representatives of the ISI in the Committee disagreed, for obvious reasons, with the proposal of having just one series for the country. By the 1970s, serious concerns about political pressure were being raised by scholars studying India’s agricultural statistics; they pointed out that ‘there seems to have been an attempt on the part of the Ministry of Agriculture to vet the final estimates as given by the States before publishing all India figures’. Furthermore, they argued that such vetting went beyond adjustments made to reach reliable estimates: any attempt ‘to arrive at an estimate by negotiation on the basis of what are essentially preconceptions as to the impact of increased absorption of various inputs including high yielding varieties, is on much shakier ground’ (Srinivasan & Vaidyanathan, 1972, p. 36).
Conclusion
The challenges facing pioneer statisticians, such as Sukhatme, Panse, and Mahalanobis, were manifold. Statistics was an emerging field with a vast and diverse array of methods being tried in different parts of the world, and they were expected to choose the methods and design of experiments that were appropriate to India’s complex social and economic realities, and provide accurate answers for its development regime without taxing its exchequer. With the international statistical community, of which they were significant active participants, they exchanged ideas, experiences, and research results, gaining in many ways from the network. However, the decision about how to deal with the challenges of India’s development problems lay with them. For a fledgling community and its star-studded leaders, this was both an enormous opportunity and a challenge. The opportunity came in the form of freedom of initiative; one can do so much when the process is either about to unroll or has just started, and the inertia of stability has not kicked in. It was a dynamic situation, one that allowed a proactive person like Mahalanobis to seize opportunities, liaise with the political establishment, attract the best of students, and make significant contributions in the areas of multivariate analysis, design of experiments, and combinatorial mathematics while facing serious challenges of implementation. However, the challenge of such an unstructured situation was that serious differences emerged among leading practitioners of the discipline, as happened between Mahalanobis and Sukhatme, as they grappled with funding bodies for resources and with the political establishment for official acceptance. In terms of resources, India was a small boat, and thus power struggles often underlay cognitive differences, as happened in the case of designing crop surveys.
The discourse around sample plot size can be organized around axes whose extremes were marked by opposed judgments: necessary and unnecessary, efficient and inefficient, economical and uneconomical. Such extremities of position about a statistical method reveal the challenges of coming up with a universally accepted design that performed entirely on objective parameters. Despite the efforts of Indian statisticians to make the messy, complex, and vast world of agricultural production amenable to mechanisms of human calculation, the process threw up challenges, revealing how much variance there could be among statisticians over the correct methods to follow in measuring nature. There was nothing perfectly objective about the methods; rather the methods evolved within certain socio-economic structures, though the proponents of respective sample plot sizes often posed themselves as the corrective of the other.
From the fragmentary, revenue-driven approach of the British Raj, the statistical surveys of crop production under the national government took a consolidated form. The independent Indian state saw statistics as a key tool of good governance, the way to build a modern and progressive country. The vastly amplified significance of statistics required that the debate over the sample plot size reach a closure which was, however, hard to find. Simply taking recourse to numbers did not help in the validation of one statistical method over the other. The inherent power asymmetry among social groups that often helps to decide the case in favour of a particular design was absent in this case, as the proponents who actively participated in the debate over the sampling plot were powerful actors, enjoying tremendous social status and high standing in the academic world (Klein & Kleinman, 2002, p. 39). The political leadership of India courted them and placed them in significant positions. Thus, after independence, new rules of play emerged which clearly demarcated the areas of operation as per the distinctive resource distribution, capacities, and incapacities (Klein & Kleinman, 2002, p. 35).
Accordingly, Mahalanobis was made the Honorary Statistical Adviser to the Cabinet, giving the technical guidance necessary for the operation of the newly established central statistical unit. The unit came with a huge scope of work, as it was charged with the responsibility of making a systematic collection of data in connection with myriad problems associated with the social, economic, cultural, and scientific development of society. In addition, based on Mahalanobis’s work in Bengal and because of his strong advocacy, the National Sample Survey was established for the collection of socioeconomic data. It was to provide the information for administrative purposes and the computation of national income. The constitution of NSS was the ultimate realization of Mahalanobis’s strong belief in the social role of statistics, and its indispensability for statecraft (Rudra, 1998, p. 136).
Footnotes
Acknowledgements
The author is thankful for invaluable inputs from the four anonymous referees of SSS, editor Sergio Sismondo, and Ms. Gursimran Butalia for being the most fantastic Research Associate.
Funding
The author received no financial support for the research, authorship, and/or publication of this article.
