# NanoWattch: A Self-Powered 3-nW RISC-V SoC Operable from 160mV Photovoltaic Input with Integrated Temperature Sensing and Adaptive Performance Scaling

Daniel S. Truesdell\*, Xinjian Liu\*, Jacob Breiholz, Shourya Gupta, Shuo Li, and Benton H. Calhoun

University of Virginia, Charlottesville, VA, USA

Email: dst4b@virginia.edu (\*equally-credited authors)

#### Abstract

This work presents NanoWattch, a self-powered SoC in 65-nm CMOS with integrated temperature sensing for miniaturized IoT applications. NanoWattch can cold-start and sustain operation directly from ambient light with a photovoltaic input as low as 160mV. A performance-scalable RISC-V processor with 6kB SRAM and DVFS subsystem enable system power consumption to continuously adapt to ambient energy conditions down to a minimum total system power of 3nW to provide always-on operation in a mm-scale form factor.

#### Introduction

Highly-miniaturized self-powered sensing systems enable the IoT to reach valuable new application spaces without obstructing space or creating maintenance demands. For particularly small form factors at the mm-scale where harvesting transducer size is limited, sensing nodes face severely-limited power budgets that can dip down to the low-nW range. To contend with this limited budget, it is critical to reduce system leakage to the nW-level while remaining adaptive to increases in ambient energy. Past works targeting this goal have synergized leakage-friendly technologies with low-leakage circuit designs and dramatically scaled back memory capacities to reduce static power. For example, the system in [1] leverages an extreme-low-power (ELP) deeply-depleted process in 55nm to reduce power. The systems in [2] and [3] revert to low-leakage 180nm technologies and leverage the dynamic-leakage suppression (DLS) logic style to reduce digital power, requiring the use of latch-based memories that occupy more area than SRAM macros.
Additionally, despite the low power achieved, these systems lack granular performance scaling (more than 2 discrete modes) for adapting performance to ambient energy harvesting. NanoWattch and its underlying techniques provide 3x-24x greater memory capacity vs. previous works while further contributing a novel Energy Harvesting and Power-Management Unit (EH-PMU) for efficient self-powered operation, increased functionality via integrated sensing, and high-granularity ambient energy tracking, all while maintaining nW-level total system power in a modern technology node.

## NanoWattch Architecture

Fig. 1 shows the system architecture of NanoWattch, consisting of the core domain (RISC-V and SRAM), the EH-PMU, and the DVFS control circuits. The RISC-V processor is implemented with the scalable DLS (SDLS) logic style [4] shown in Fig. 2, which enables standard cells to be fluidly scaled between a low-leakage/high-delay state and a high-leakage/low-delay state using two control voltages, $V_{CN}$ and $V_{CP}$. A gate-leakage-based reference current generator is used with a dual charge pump design to regulate $V_{CN}$ and $V_{CP}$ in complementary fashion to one of 8 discrete values. Reference current tuning and $V_C$ mode selection are available through memory-mapped registers. Since the DLS logic designs from [2-4] lose efficacy at advanced technology nodes due to increased gate leakage, the SDLS cells in NanoWattch utilize minimum-sized thick-oxide devices, and the header and footer are forward body-biased to recover operating speed lost from increased node capacitances. A memory-mapped tunable critical path replication circuit tracks $V_{CN}$ and $V_{CP}$ and enables independent frequency scaling. The APB bus incorporates a sub-nW power-gateable temperature sensor [5], 8-bit GPIO, and 2 SPI modules.
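As a rough illustration of how software might use the 8 discrete $V_{CN}$/$V_{CP}$ modes, the following Python sketch picks the fastest mode whose power fits a given budget. It is a minimal sketch under stated assumptions, not the NanoWattch firmware: the helper name `select_vc_mode` and the table `MODES` are hypothetical, and every per-mode number is a placeholder except the 0.6V endpoints reported in the measurement results (3nW minimum total power, 320Hz at 62nW).

```python
from typing import NamedTuple


class VcMode(NamedTuple):
    index: int        # 0 = lowest leakage / slowest, 7 = highest leakage / fastest
    power_nw: float   # total system power in nW
    freq_hz: float    # maximum clock frequency in Hz


# Hypothetical calibration table. Only the 3 nW floor and the
# 62 nW / 320 Hz ceiling come from the paper; all other values
# are illustrative placeholders.
MODES = [
    VcMode(0, 3.0, 0.5),
    VcMode(1, 5.0, 2.0),
    VcMode(2, 8.0, 6.0),
    VcMode(3, 12.0, 15.0),
    VcMode(4, 20.0, 40.0),
    VcMode(5, 30.0, 90.0),
    VcMode(6, 45.0, 180.0),
    VcMode(7, 62.0, 320.0),
]


def select_vc_mode(budget_nw: float) -> VcMode:
    """Return the fastest mode whose power fits the budget.

    Falls back to the lowest-power mode when nothing fits, since the
    system is meant to stay always-on.
    """
    feasible = [m for m in MODES if m.power_nw <= budget_nw]
    if not feasible:
        return MODES[0]
    return max(feasible, key=lambda m: m.freq_hz)


if __name__ == "__main__":
    for budget in (1.0, 10.0, 100.0):
        mode = select_vc_mode(budget)
        print(f"budget={budget} nW -> mode {mode.index} ({mode.freq_hz} Hz)")
```

On real hardware the selected index would be written to the memory-mapped mode register; that register's address and layout are not given in this paper, so the sketch stops at choosing the index.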
To improve memory capacity in NanoWattch versus prior latch-based approaches, a 13T DLS SRAM bitcell [6] is utilized (also with thick-oxide devices) in a novel 6kB macro design that uses standard CMOS logic gates in row/column decoders with automatic sub-clock cycle power gating. The EH-PMU generates a 0.6V rail from a photovoltaic source with a cold-start (CS) circuit and primary switched-capacitor (SC) boost converter (Fig. 4, right), a 1.2V rail from a voltage doubler (Fig. 4, left), and a 0.9V rail from an LDO. The CS supplies the rest of the EH-PMU until the 0.6V rail can instead provide regulation, and then the CS is power gated. Fig. 3 shows the schematic of the integrated two-dimensional maximum power point tracking (2D-MPPT), which uses an always-on fractional open circuit voltage (FOCV) tracking scheme for SC frequency modulation, and a memory-mapped 4b SAR ADC for SC conversion ratio modulation. The FOCV MPPT regulates $V_{PV}$ at K (initially 0.68) times the open circuit PV voltage for a sufficient duration, at which point the EH-PMU enters lock mode to indicate that the frequency has reached a steady state. Hysteresis voltage regulation (HVR) runs in parallel and will discharge $V_{OUT}$ if it exceeds a high threshold, and will temporarily overclock itself (CLK<sub>REG</sub>) to quickly identify and abort once $V_{OUT} < V_{REFL}$. To achieve low quiescent power and high harvesting efficiency, three techniques are implemented: 1) the EH-PMU power gates all blocks (with retention cells) aside from the FOCV MPPT when $V_{PV}$ is lower than a threshold and the frequency is locked, 2) the ADC and HVR sampling frequencies scale proportionally with the main SC output frequency to reduce dynamic power when harvested energy is low, and 3) the FOCV coefficient K is tracked and optimized with a hill-climbing algorithm based on the frequency of the output regulation, REG<sub>EN</sub>.

## **Measurement Results**

Fig.
5 demonstrates NanoWattch cold-starting from ambient light, with the 0.6V and 1.2V rails shown powering up, after which the CS circuit is power-gated. Fig. 6 shows the measured SC power conversion efficiency and minimum PV input voltages for CS (290mV) and continuous harvesting (160mV) across 10 dies. Fig. 7 demonstrates ambient energy tracking while executing code that reads from the temperature sensor, outputs the value to GPIO, samples the EH-PMU ADC to detect changes in ambient energy harvesting, and responds with calibrated LUT-based DVFS adjustments to $V_{CN}$, $V_{CP}$, and the critical path replicator delay setting. Fig. 8 shows the die photo of NanoWattch in 65-nm CMOS with an active area of 3.1mm<sup>2</sup> and summarizes the core domain performance. Across $V_{DD}$/$V_{CN}$/$V_{CP}$, the core achieves a 2900x frequency and 314x power scaling range (top-right), which is summarized by the Pareto-optimal performance curve (bottom-left). Across $V_{DD}$, the minimum energy-per-cycle is 168pJ at 0.55V, while the minimum total leakage power (active operation, 0Hz clock) is 1.3nW, and the maximum clock frequency is 850Hz. At the nominal system $V_{DD}$ (0.6V for core+SRAM), NanoWattch can scale down to 3nW total power and up to a maximum of 320Hz at 62nW.

## Acknowledgements

This work was funded in part by MIT Lincoln Laboratory and the NSF NERC ASSIST Center (EEC-1160483).

#### References

[1] L. Lin, JSSC, 2021, pp. 1618–1629.

[2] W. Lim, ISSCC, 2015, pp. 146–147.

[3] X. Wu, VLSIC, 2018, pp. 191–192.

[4] D. Truesdell, SSCL, 2019, pp. 57–60.

[5] D. Truesdell, CICC, 2019.

[6] S. Gupta, VLSIC, 2020.

Fig. 1. Block diagram of NanoWattch SoC including SDLS core, DLS SRAM, and EH-PMU with DPM. Power domains are indicated with colored markers.

Fig. 2. Core circuit diagrams: $V_{CN}$/$V_{CP}$ charge pump, ref. current generator, DLS bitcell with WL level shifter, and SDLS std. cell operating concept.

Fig. 3.
Schematic of the MPPT and supporting control circuits. Shaded area indicates always-on blocks.

Fig. 4. Schematic of the switched-capacitor voltage doubler and boost converter, and body regulation circuits. An illustration of the 2D-MPPT algorithm operating concept is also shown.

Fig. 5. PV cold-start showing the 0.6V and 1.2V supply rails. Once the 0.6V rail has booted, the CS circuits are power-gated.

Fig. 6. Meas. EH-PMU efficiency and minimal harvesting Vin.

Fig. 7. Adaptive performance scaling (DVFS) during ambient light fluctuations while taking temperature measurements.

[Figure 8 image. Panels: die photo with 2mm and 3mm dimension labels showing the 6kB DLS SRAM and SDLS RISC-V microprocessor; total system power (W) vs. $V_{dd}$ (V) under a $V_{cn}$/$V_{cp}$ sweep; energy per cycle vs. $V_{dd}$ (V), annotated with the 168pJ/cycle minimum; Pareto-optimal max. frequency (Hz) vs. total system power (W), with $V_{dd}$ swept from 0.4V to 0.9V; leakage power (W) vs. $V_{dd}$ (V) for core, SRAM, and SRAM VDDH, annotated with the 1.3nW minimum leakage and 2.9nW at 0.6V.]

Fig. 8. NanoWattch die photo, measured perf. scaling range vs. $V_{DD}$ and $V_{CN}/V_{CP}$, min. E/cycle, Pareto-optimal performance, and leakage breakdown.

| Parameter | This Work | JSSC '21 [1] | ISSCC '15 [2] | VLSI '18 [3] |
|---|---|---|---|---|
| Processor | RISC-V (32b) | MSP430 | ARM Cortex M0+ | ARM Cortex M0+ |
| Technology (nm) | 65 LP | 180 | 180 | 55 XLP DDC |
| SoC Area (mm²) | 3.1 | 5.33 | 2.04 | 0.144 |
| Memory | 6kB SRAM | 2kB Latch | 0.25kB Latch | 0.5kB SRAM |
| Energy Harvesting | Photovoltaic | Photovoltaic | Photovoltaic | Photovoltaic |
| Min. Harvesting Input | 160mV / 150lx | N/A / 55lx | N/A / 240lx | N/A / 3000lx |
| Performance Modes | 8 (analog) | 2 (binary) | 1 | 1 |
| Adaptive Ambient Energy Tracking | Yes | No | No | No |
| Peripherals | GPIO, SPI, DVFS control, Temp. Sensor, EH-PMU w/ CS | DMA, GPIO, clock gen., EH-PMU w/ CS | Clock gen. | Optical TRX, Temp. Sensor, EH-PMU w/ CS |
| Supply Voltage (V) | 0.4 – 0.9 | 0.2 – 1.1 | 0.16 – 1.15 | 0.65 – 1.5 |
| Frequency Range | <1Hz – 850Hz | 1Hz – 2.8MHz | 2Hz – 15Hz | 46Hz\* |
| Minimum Active Power | 1.3nW @ 0.4V, 0.5Hz | 0.59nW @ 0.45V, 2Hz | 0.295nW @ 0.55V, 2Hz | 16nW |

Fig. 9. Performance summary and comparison to state-of-the-art. \*Estimated from available data.
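The hill-climbing optimization of the FOCV coefficient K described in the architecture section can be illustrated with a short perturb-and-observe sketch in Python. This is only a sketch under stated assumptions, not the on-chip implementation: the function name `hill_climb_k`, the step size, the bounds on K, and the iteration count are all hypothetical, and it assumes that a higher REG<sub>EN</sub> rate indicates more harvested power. Only the starting value K = 0.68 comes from the text.

```python
from typing import Callable


def hill_climb_k(
    measure_reg_rate: Callable[[float], float],
    k: float = 0.68,       # initial K from the paper
    step: float = 0.01,    # hypothetical perturbation size
    k_min: float = 0.5,    # hypothetical lower bound on K
    k_max: float = 0.9,    # hypothetical upper bound on K
    iterations: int = 50,  # hypothetical number of search steps
) -> float:
    """Perturb-and-observe search for the K that maximizes the regulation rate.

    measure_reg_rate(k) stands in for observing how often output
    regulation (REG_EN) fires while V_PV is regulated at k times the
    open-circuit voltage. Assumption: a higher rate means more
    harvested power.
    """
    direction = 1
    best = measure_reg_rate(k)
    for _ in range(iterations):
        # Try one step in the current direction, clamped to the bounds.
        candidate = min(k_max, max(k_min, k + direction * step))
        rate = measure_reg_rate(candidate)
        if rate > best:
            # Improvement: accept the step and keep going the same way.
            k, best = candidate, rate
        else:
            # No improvement: stay put and probe the other direction next.
            direction = -direction
    return k


if __name__ == "__main__":
    # Synthetic stand-in for the harvester: a single peak at K = 0.75.
    print(round(hill_climb_k(lambda k: -(k - 0.75) ** 2), 2))
```

With a single-peaked response, the search settles within one step of the peak and then alternates probes on either side of it, which is the expected behavior of a fixed-step hill climber.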