Cyber threats and vulnerabilities pose a likely and imminent risk to the critical infrastructure grid, which includes facilities, supervisory control and data acquisition (SCADA) systems, and field devices. The intricate network architecture of the smart grid is exposed to hidden risks posed by interconnected heterogeneous devices from multiple vendors, integrated open source and commercial off-the-shelf (COTS) components, and minimal (or absent) supply chain cyber hygiene. The advent of bring your own device (relays, breakers, air conditioning, water heaters, pumps, thermostats, chargers, batteries, remote terminal units, programmable logic controllers, wired/wireless telemetry, etc.) and of remote access mechanisms for data access and control increases volatility and unpredictability in the loosely coupled ecosystem. Upstream signaling and transmission from end user devices to servers expand the staging surface for attackers. The separation of duties between operational technology (OT) and information technology (IT) leaves a contextual gap between a focus on reliability and availability and a focus on security and vulnerability management.
Regulations and standards proposed by the North American Electric Reliability Corporation (NERC), the Federal Energy Regulatory Commission (FERC) and the National Institute of Standards and Technology (NIST), such as Critical Infrastructure Protection (CIP Version 5) and the Cybersecurity Capability Maturity Model (C2M2), establish guidelines to enhance compliance and security. While compliance may reduce risks, it is not an assurance of security and does not eliminate all residual risks, because audit checklists lag behind security punch lists as threat vectors shift and evolve.
The traditional strategy of a multi-layer defense, with security controls geared towards protection, requires complementary and continuous monitoring as a checkpoint for risk aversion and zero tolerance.
Excessive reliance on whitelisting and on threat data obtained through intelligence sharing among intelligence agencies and public and private industry causes attribution bias, diagnosis deficit, and concerns about loss of privacy and anonymity. Proactive penetration testing and vulnerability scans in utility grids (bulk power systems, distribution systems, field and control systems, upstream devices) may disrupt normal operations and put systems at unnecessary risk.
Coordinated cyber-attacks on utility grids are directed towards sabotage, disruption and ransom rather than theft of intellectual property. Landed malware (delivered as a malicious file attachment or hyperlink by email, through social networking, on a website or on a USB device) uses lateral reconnaissance to scan and harvest information, gives attackers a backdoor for remote command and control, and erases evidence that is crucial for traceability. Detection of advanced persistent threats requires alerting and reporting capabilities based on real-time monitoring of network traffic and flow metrics, of signaling integrity in a loosely coupled ecosystem, and of data exchanges between tiered silos, so that intervention and mitigation are timely. Network device, system and application logs that are generated from manually configured rules are prone to misconfiguration and human error, and can miss undiscovered exploits.
Further, insider threats posed by authenticated and authorized malicious actors (disgruntled employees, compromised credentials, or physical intrusion into a facility) present a unique challenge: detecting abuse of privilege through behavior analysis without requiring cumbersome manual configuration to establish a reliable and trustworthy baseline reference of operational integrity.
A security control (statement of policy) is a collaborative effort between the policy maker and the decision maker to facilitate the detection of threats and risks to a business process or operation, to enable mitigation actions that isolate the damaged asset, and to restore normal asset functions for business continuity. This requires a holistic end-to-end approach that builds resiliency through strategic systems and network design for fault tolerance, based on asset redundancy (an active/passive failover model of operation controls) and the elimination of single points of operational failure. Identifying the affected asset and the impact, in order to qualify activities, is critical for early risk awareness and a proportional response.