The nature, source and purpose of management information

Sources of data

Learning outcomes

  • Describe the three main data sources: machine/sensor, transactional and human/social.
  • Describe and identify sources and categories of information including internal, external, primary and secondary.
  • Explain the uses and limitations of published information/data (including information from the internet).
  • Identify the data capture costs of management accounting information.

Objective a: Describe the three main data sources: machine/sensor, transactional and human/social.

In the digital age, organizations harvest data from a vast array of sources to feed their management information systems. These sources can be broadly categorized into three main types: machine/sensor data, transactional data, and human/social data.

Machine/sensor data is generated automatically by equipment, IoT (Internet of Things) devices, and computer systems without human intervention. This data is typically characterized by its high velocity and massive volume. Examples include GPS coordinates from delivery trucks, temperature readings from factory machinery, or server log files. It provides real-time, objective insights into operational efficiency.

Transactional data is generated by the day-to-day business operations and commercial exchanges of the organization. Every time a customer buys a product, a supplier is paid, or an employee logs their hours, a transaction is recorded. This data is highly structured and forms the backbone of traditional financial and management accounting systems. Examples include sales invoices, purchase orders, and payroll records. It is essential for tracking revenues, costs, and inventory levels.

Human/social data is generated by people, often through their interactions on social media, emails, surveys, or customer service calls. Unlike transactional data, human/social data is largely unstructured and qualitative. It includes customer reviews, tweets about a brand, or feedback from employee focus groups. While harder to analyze than numbers, it provides vital context regarding customer sentiment, brand reputation, and market trends.

Consider 'SkyCourier', a drone delivery network that relies on all three: machine data from the drones' altimeters helps keep them from crashing; transactional data from the billing system tracks revenue per delivery; and human/social data from Twitter helps the company identify whether customers are complaining about noisy drones in specific neighborhoods.

Key Point

Data Velocity and Structure

Machine data is usually high-velocity and automated. Transactional data is highly structured (records and figures held in databases). Human/social data is often unstructured (text, images, expressions of opinion) and requires advanced analytics to interpret.

SkyCourier: Integrating Three Data Sources

    Step 1: Harvesting Machine/Sensor Data

    SkyCourier's fleet of delivery drones continuously transmits telemetry data: battery drain rates, wind resistance, and GPS location. The management accountant uses this automated sensor data to calculate the exact electricity cost per mile flown in different weather conditions.

    Step 2: Analyzing Transactional Data

    Simultaneously, the company's e-commerce platform records every customer order, payment method, and delivery fee. This structured transactional data allows the accountant to generate daily revenue reports and identify which delivery zones are the most profitable.

    Step 3: Interpreting Human/Social Data

    The marketing team scrapes local social media groups and finds a surge of unstructured human/social data: angry posts about drones waking up babies in the 'Northwood' zone. Combining this social data with the transactional data, management decides to restrict Northwood deliveries to afternoon hours, preserving brand reputation.

Modern management accounting requires synthesizing all three data types. Relying solely on transactional data while ignoring machine efficiency or human sentiment leads to a dangerous blind spot.
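The SkyCourier walk-through can be sketched in a few lines of Python. This is only an illustration: every record, the electricity tariff (`PRICE_PER_KWH`), the keyword list and the `zone_summary` helper are hypothetical, and real human/social data would need far more capable text analytics than a keyword match.

```python
from collections import defaultdict

# All figures below are hypothetical illustration data, not SkyCourier records.
telemetry = [  # machine/sensor data: one row per flight
    {"zone": "Northwood", "miles": 4.0, "kwh_used": 0.8},
    {"zone": "Northwood", "miles": 6.0, "kwh_used": 1.2},
    {"zone": "Eastgate", "miles": 5.0, "kwh_used": 1.5},
]
orders = [  # transactional data: one row per paid delivery
    {"zone": "Northwood", "fee": 9.0},
    {"zone": "Northwood", "fee": 9.0},
    {"zone": "Eastgate", "fee": 12.0},
]
posts = [  # human/social data: free text tagged with a zone
    {"zone": "Northwood", "text": "Drone noise woke the baby again"},
    {"zone": "Northwood", "text": "Love the fast delivery!"},
    {"zone": "Eastgate", "text": "Parcel arrived on time"},
]

PRICE_PER_KWH = 0.25  # assumed electricity tariff
COMPLAINT_WORDS = ("noise", "noisy", "loud", "woke")  # crude keyword filter


def zone_summary(telemetry, orders, posts):
    """Combine the three data sources into one record per delivery zone."""
    summary = defaultdict(
        lambda: {"miles": 0.0, "energy_cost": 0.0, "revenue": 0.0, "complaints": 0}
    )
    for row in telemetry:  # sensor data gives energy cost and distance
        zone = summary[row["zone"]]
        zone["miles"] += row["miles"]
        zone["energy_cost"] += row["kwh_used"] * PRICE_PER_KWH
    for row in orders:  # transactional data gives revenue
        summary[row["zone"]]["revenue"] += row["fee"]
    for row in posts:  # social data gives a rough complaint count
        text = row["text"].lower()
        if any(word in text for word in COMPLAINT_WORDS):
            summary[row["zone"]]["complaints"] += 1
    for zone in summary.values():
        zone["cost_per_mile"] = zone["energy_cost"] / zone["miles"] if zone["miles"] else 0.0
    return dict(summary)


for name, figures in zone_summary(telemetry, orders, posts).items():
    print(name, figures)
```

Keeping the three inputs separate until the final summary mirrors the point of the case: each source answers a different question (cost, revenue, sentiment), and the decision only becomes visible once they are viewed side by side per zone.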

Practice Question

Which of the following is the best example of machine/sensor data?

Practice Question

Data generated by the day-to-day commercial exchanges of an organization, such as purchase orders and payroll records, is classified as:

Practice Question

Why is human/social data often more difficult for management accountants to analyze than transactional data?

Objective b: Describe and identify sources and categories of information including internal, external, primary and secondary.

Information used by management can be categorized by its origin (Internal vs. External) and by its original purpose (Primary vs. Secondary). Internal information originates from within the organization itself. Examples include the company's own sales records, production timesheets, and employee performance reviews. It is usually highly relevant, easily accessible, and confidential. External information originates from outside the organization. Examples include government inflation statistics, competitor pricing data, and industry market research reports. External information is crucial for strategic planning, as a company cannot survive by only looking inward.

Independent of where it comes from, information is also classified as primary or secondary. Primary information is data collected specifically for the problem or task at hand. It is 'first-hand' data. For example, if a company commissions a bespoke survey to ask customers about a brand-new product prototype, that is primary data. It is highly relevant but often expensive and time-consuming to gather.

Secondary information is data that has already been collected by someone else (or by the company itself at an earlier date) for a different purpose, but is now being reused for the current problem. It is 'second-hand' data. For example, looking up national census data to estimate the population of a city before opening a store is using secondary data.

Consider 'CulturedCuts', a lab-grown meat producer. To decide on pricing, they use Internal Secondary data (past sales of their older products), External Secondary data (a published UN report on global meat consumption), and External Primary data (hiring a firm to conduct blind taste tests of their new lab-grown steak against a competitor's).

Definition

Primary vs Secondary Data

Primary Data: Collected firsthand specifically for the current purpose (e.g., a custom survey).
Secondary Data: Already exists; collected previously for another purpose but reused now (e.g., reading a published industry report).

CulturedCuts: Categorizing Information Sources

    Step 1: Sourcing Internal Secondary Data

    CulturedCuts wants to launch a new line of lab-grown chicken nuggets. The management accountant first looks at the company's own sales database from last year to see how well their lab-grown burgers sold in winter. This is Internal (from within the company) and Secondary (collected originally for last year's accounting, not specifically for this new nugget launch).

    Step 2: Sourcing External Secondary Data

    Next, the marketing team downloads a free government report on national poultry consumption trends over the last decade. This is External (from the government) and Secondary (the government collected it for national statistics, not for CulturedCuts' specific product launch).

    Step 3: Sourcing External Primary Data

    Finally, realizing they need specific feedback on the nugget's texture, they hire an outside market research agency to conduct focus groups with 100 consumers. This is External (gathered from consumers by an outside agency) and Primary (collected specifically and exclusively for the purpose of evaluating this exact nugget prototype).

Managers must blend these categories. Primary data is perfectly tailored but expensive; secondary data is cheap but may lack specific relevance. Internal data is accessible, but external data provides vital market context.
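The two classification questions (where did the information originate, and was it collected for the current task?) are independent, which a short Python sketch can make explicit. The `InfoSource` class and its field names are hypothetical; the three examples are the CulturedCuts sources described above.

```python
from dataclasses import dataclass
from enum import Enum


class Origin(Enum):
    INTERNAL = "internal"
    EXTERNAL = "external"


class Purpose(Enum):
    PRIMARY = "primary"      # collected for the current task
    SECONDARY = "secondary"  # collected earlier for another purpose


@dataclass(frozen=True)
class InfoSource:
    """Hypothetical record answering the two classification questions."""
    name: str
    produced_inside_organization: bool
    collected_for_current_task: bool

    @property
    def origin(self) -> Origin:
        return Origin.INTERNAL if self.produced_inside_organization else Origin.EXTERNAL

    @property
    def purpose(self) -> Purpose:
        return Purpose.PRIMARY if self.collected_for_current_task else Purpose.SECONDARY

    def label(self) -> str:
        return f"{self.origin.value.title()} {self.purpose.value.title()}"


sources = [
    InfoSource("Last year's burger sales database", True, False),
    InfoSource("Government poultry consumption report", False, False),
    InfoSource("Agency focus groups on the nugget prototype", False, True),
]

for source in sources:
    print(f"{source.name}: {source.label()}")
```

Because origin and purpose are answered separately, the same structure also covers the fourth combination, Internal Primary, such as a survey a company runs among its own staff for the decision at hand.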

Practice Question

A company accesses a database of national demographic statistics published by the government to help decide where to open a new retail store. How should this information be categorized?

Practice Question

Which of the following is the best example of Primary data?

Practice Question

What is a major disadvantage of relying solely on Primary data for decision-making?

Objective c: Explain the uses and limitations of published information/data (including information from the internet).

Published information, particularly from the internet, is a vast and easily accessible source of external secondary data. Organizations use published data extensively for benchmarking, market research, and strategic planning. Common sources include government statistical websites (e.g., census data, inflation rates), financial news outlets, industry trade journals, and competitor websites. A key use of this data is to provide external context, both macroeconomic and market-level, at low cost. For instance, a company can use published exchange rate forecasts to inform its currency hedging decisions, or use demographic data to identify emerging markets without having to spend millions conducting its own primary research.

However, the limitations of published information, especially internet data, are severe and must be carefully managed. The most critical limitation is veracity (accuracy and truthfulness). The internet is saturated with unverified, biased, or deliberately misleading information. A competitor's press release may exaggerate their market share, or a blog post might contain flawed statistical analysis. Furthermore, published data is often outdated by the time it is compiled and released. Finally, published data is generic; it is available to everyone (including competitors) and is not tailored to the specific strategic nuances of your organization.

Consider 'ThreadLogic', an AI-driven fashion trend predictor. ThreadLogic uses published internet data by scraping millions of public social media posts and fashion blogs to predict next season's colors. The use is clear: it provides a massive, free dataset of global sentiment. However, the limitations are significant. The data suffers from selection bias (only certain demographics post about fashion online), it may be manipulated by paid influencers (lacking veracity), and because it is public, ThreadLogic's competitors can scrape the exact same data, eliminating any unique competitive advantage.

Examiner Tip

Bias and Verification

Examiners often test your critical thinking regarding internet data. Never assume published data is neutral. Always consider the source's motivation (e.g., a trade association might publish data that makes their industry look better than it is). Verification is essential.

ThreadLogic: Navigating Internet Data Limitations

    Step 1: Identifying the Data Source

    ThreadLogic's management team wants to forecast the demand for sustainable fabrics. They find a comprehensive, free report online titled 'The Future of Green Fashion', published by a consortium of organic cotton farmers.

    Step 2: Assessing the Limitations (Bias)

    The management accountant reviews the report and flags a major limitation: bias. Because the report is published by cotton farmers, it heavily promotes cotton as the only sustainable choice and downplays the viability of recycled synthetics. The data cannot be treated as objective.

    Step 3: Cross-Verification

    To mitigate this limitation, ThreadLogic does not rely solely on this internet report. They cross-reference the claims with published, peer-reviewed environmental science journals and official government trade statistics on textile imports, ensuring a more balanced and accurate forecast.

Published internet data is a powerful starting point, but its limitations (bias, lack of specificity, and questionable veracity) mean it must be critically evaluated before being used for major business decisions.
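One simple way to picture cross-verification is to line up the same figure from several sources and flag any that sit far from the rest. The sketch below is an assumption-laden illustration: the growth figures are invented, the 25% tolerance is arbitrary, and `flag_outlying_claims` is a hypothetical helper. A flagged source is not proven wrong; it is simply the one whose motivation and method deserve a closer look.

```python
from statistics import median


def flag_outlying_claims(claims, tolerance=0.25):
    """Return sources whose figure differs from the median by more than
    `tolerance`, expressed as a fraction of the median."""
    mid = median(claims.values())
    return sorted(
        source
        for source, value in claims.items()
        if mid and abs(value - mid) / abs(mid) > tolerance
    )


# Hypothetical forecasts of annual growth (%) in sustainable-fabric demand.
claims = {
    "Organic cotton consortium report": 18.0,
    "Peer-reviewed journal estimate": 9.0,
    "Government trade statistics": 8.0,
}

print(flag_outlying_claims(claims))
```

With these made-up inputs, only the consortium report is flagged, which matches the ThreadLogic accountant's instinct to question the source with a commercial interest in the answer.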

Practice Question

Which of the following is a primary advantage of using published government statistical data for business planning?

Practice Question

A company finds a free market research report on the internet published by a major competitor. What is the most significant limitation of using this report for internal decision-making?

Practice Question

Why is 'lack of specificity' considered a limitation of published internet data?

Objective d: Identify the data capture costs of management accounting information.

Information is not free. The process of gathering, processing, and storing data incurs significant costs, which must be weighed against the benefits the information provides (the cost-benefit attribute). Data capture costs can be divided into direct and indirect costs. Direct costs are the obvious, out-of-pocket expenses required to acquire the data. This includes purchasing hardware (like barcode scanners, RFID readers, or servers), buying software licenses (like ERP systems or database management tools), paying subscriptions for external market data, and the wages of the data entry clerks or IT staff who manage the systems.

Indirect costs are less obvious but equally impactful. These include the time spent by operational staff recording data instead of doing their primary jobs. For example, if a highly paid engineer has to spend 30 minutes a day filling out detailed timesheets for cost-allocation purposes, the cost of that captured data includes the lost productivity of that engineer. Other indirect costs include the cost of data storage (cloud hosting fees), data security (cybersecurity measures to protect the captured data), and the cost of errors if inaccurate data is captured and needs to be corrected.
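The engineer example can be turned into a rough annual figure. Only the 30 minutes per day comes from the text; the hourly employment cost and the number of working days below are assumed values chosen purely for illustration.

```python
def annual_timesheet_cost(minutes_per_day, hourly_cost, working_days):
    """Indirect cost of staff time spent recording data, per person per year."""
    return minutes_per_day / 60 * hourly_cost * working_days


# 30 minutes per day is from the example; 80 per hour and 220 days are assumptions.
cost = annual_timesheet_cost(minutes_per_day=30, hourly_cost=80.0, working_days=220)
print(cost)  # 8800.0
```

Under these assumptions, the timesheets absorb 8,800 of engineering time per engineer per year, a cost that appears on no hardware or software invoice.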

Consider 'ChainTrace', a company that uses blockchain technology to track the provenance of organic coffee beans from farm to cup. To provide management with real-time inventory information, ChainTrace must capture data at every step. The direct costs include buying rugged tablets for the farmers, paying for satellite internet in remote areas, and licensing the blockchain software. The indirect costs include the extra 5 minutes it takes a warehouse worker to scan and verify each sack of beans, slowing down the loading process. Management must ensure that the premium price they can charge for 'fully traceable' coffee outweighs these substantial data capture costs.

Common Mistake

Ignoring Indirect Costs

Students often only think of IT hardware when asked about data capture costs. Do not forget the human element: the time operational staff spend filling out forms or scanning items is a massive indirect cost of data capture.

ChainTrace: Evaluating Data Capture Methods

    Step 1: The Manual Entry Option

    ChainTrace considers having warehouse staff manually type the serial number of each coffee sack into a computer. The direct cost is low (just a basic PC). However, the indirect cost is massive: it takes 2 minutes per sack, causing huge delays in shipping, and human error leads to frequent inventory mismatches.

    Step 2: The RFID Option

    Alternatively, they consider attaching RFID (Radio Frequency Identification) tags to each sack. The direct costs are high: purchasing thousands of tags and installing expensive RFID reader gates at the warehouse doors.

    Step 3: The Cost-Benefit Decision

    The management accountant calculates that while the RFID system has high direct costs, it removes the indirect cost of lost warehouse productivity (sacks are scanned instantly as the forklift drives through the gate) and all but eliminates data entry errors. The overall cost-benefit analysis favors the expensive RFID system.

Evaluating data capture costs requires looking at the total cost of ownership, balancing cheap hardware against expensive human time and error rates.
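The ChainTrace comparison can be sketched as a total-cost calculation. The 2 minutes per sack for manual entry and the instant RFID scan come from the walk-through; every other number (volumes, wage rate, hardware prices, error rates, cost per error, useful life) is a hypothetical input, and `CaptureMethod` and `annual_cost` are illustrative names rather than a prescribed method.

```python
from dataclasses import dataclass


@dataclass
class CaptureMethod:
    name: str
    upfront_cost: float      # direct: hardware and installation
    cost_per_sack: float     # direct: consumables such as tags
    minutes_per_sack: float  # indirect: staff time spent capturing data
    error_rate: float        # share of sacks whose record needs correcting


def annual_cost(method, sacks_per_year, labour_cost_per_hour, cost_per_error, years_of_use):
    """Direct plus indirect data capture cost per year, spreading the
    upfront cost evenly over the years of use."""
    direct = method.upfront_cost / years_of_use + method.cost_per_sack * sacks_per_year
    labour = method.minutes_per_sack / 60 * labour_cost_per_hour * sacks_per_year
    errors = method.error_rate * sacks_per_year * cost_per_error
    return direct + labour + errors


# Hypothetical inputs, apart from the 2 minutes per sack and the instant RFID scan.
manual = CaptureMethod("Manual entry", upfront_cost=1_000, cost_per_sack=0.0,
                       minutes_per_sack=2.0, error_rate=0.02)
rfid = CaptureMethod("RFID gates", upfront_cost=60_000, cost_per_sack=0.20,
                     minutes_per_sack=0.0, error_rate=0.001)
assumptions = dict(sacks_per_year=100_000, labour_cost_per_hour=18.0,
                   cost_per_error=15.0, years_of_use=5)

for method in (manual, rfid):
    print(f"{method.name}: {annual_cost(method, **assumptions):,.0f} per year")
```

With these assumed inputs the RFID option comes out cheaper per year despite its far higher upfront cost, because labour time and error correction dominate the manual option. Different volumes or wage rates could reverse the result, which is why the calculation has to be redone for each organization.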

Practice Question

Which of the following represents an indirect cost of capturing management accounting data?

Practice Question

A company decides to upgrade from manual data entry to barcode scanners. What is the most likely impact on data capture costs?

Practice Question

When a management accountant assesses whether to implement a new, highly detailed data capture system, which fundamental principle of good information must they apply?