컴퓨터구조 공부6

Computer Architecture

우리가 항상 사용하는 컴퓨터는 정말 복잡한 시스템으로 구성되어 있다.


Memory Hierarchy

Locality

  • 컴퓨터가 메모리에 한 번 접근했다면 다시 그 주소에 접근할 가능성이 크다.
  • Temporal locality와 Spatial locality로 나눌 수 있다.

Temporal locality

  • Items accessed recently are likely to be accessed again soon
  • ex) instructions in a loop, induction variables

Spatial locality

  • Items near those accessed recently are likely to be accessed soon
  • ex) sequential instruction access, array data

이러한 Locality를 최대한 활용하여 Advantage를 얻어야 한다.

-> Memory Hierarchy 등장

  • disk에 모든 것을 저장
  • 최근에 accessed된 내용을 disk에서 더 작은 DRAM memory에 저장 (main memory)
  • 그 중에서도 또 가장 최근에 accessed된 item을 더 작은 SRAM memory에 저장
    • Cache memory가 CPU에 붙어있음

  • Block (aka line) : unit of copying
    • May be multiple words (4 bytes)
  • If accessed data is present in upper level
    • Hit : access satisfied by upper level
      • Hit ratio : hits/accesses
  • If accessed data is absent
    • Miss : block copied from lower level
    • Time taken : miss penalty
    • Miss ratio : misses/accesses = 1 - hit ratio
  • Then accessed data supplied from upper level

-> Hit ratio를 높이고 Miss rate를 낮추는 것이 굉장히 중요함. Penalty를 낮출 수 있음.


Cache Memory

  • The level of the memory hierarchy closest to the CPU
  • 여러가지 맵핑 방법들이 존재함

Tags and Valid Bits

  • tag로 cache를 관리
  • valid로 사용할 것인지 아닌지 관리
    • Initially 0

Cache

Block Size Considerations

  • Larger blocks should reduce miss rate
    • Due to spatial locality
  • But in a fixed-sized cache
    • Larger blocks -> fewer of them
      • More competition -> increased miss rate
    • Larger blocks -> pollution
  • Larger miss penalty
    • Can override benefit of reduced miss rate

Cache Misses

  • On cache hit, CPU proceeds normally
  • On cache miss
    • Stall the CPU pipeline
    • Fetch block from next level of hierarchy
    • Instruction cache miss
      • Restart instruction fetch
    • Data cache miss
      • complete data access

-> cache miss 발생하지 않도록 hierarchy 개선

Write-Through

  • On data-write hit, could just update the block in cache
    • But then cache and memory would be inconsistent
  • Write-Through : also update memory
  • But makes write take longer
  • Solution: write buffer
    • Holds data waiting to be written to memory
    • CPU continues immediately
      • Only stalls on write if write buffer is already full

Write-Back

  • On data-write hit, just update the block in cache
    • Keep track of whether each block is dirty
  • When a dirty block is replaced
    • Write it back to memory

-> 즉, write-through는 memory에 즉시 업데이트, Write-Back은 dirty를 보고 memory에 업데이트


Measuring Cache Performance

$\mbox{Memory stall cycles} = \frac{\mbox{Memory accesses}}{\mbox{Program}}\times\mbox{Miss rate}\times\mbox{Miss penalty}=\frac{\mbox{Instructions}}{\mbox{Program}}\times\frac{\mbox{Misses}}{\mbox{Instruction}}\times\mbox{Miss penalty}$


Average Access Time

$\mbox{AMAT}=\mbox{Hit time}+\mbox{Miss rate}\times\mbox{Miss penalty}$


Performance Summary

  • When CPU performance increased
    • Miss penalty becomes more significant
  • Decreasing base CPI
    • Greater proportion of time spent on memory stalls
  • Increasing clock rate
    • Memory stalls account for more CPU cycles
  • cache behavior

-> Cache behavior가 상당히 성능측에서 중요한 역할을 담당함.


Associative Caches

  • Fully associative
    • Allow a given block to go in any cache entry
    • Requires all entries to be searched at once
    • Comparator per entry (expensive)
  • n-way set associative
    • Each set contains n entries
    • Block number determines which set
      • (Block number) modulo (#Sets in cache)
    • Search all entries in a given set at once
    • n comparators (less expensive)

Cache


Replacement Policy

  • Direct mapped : no choice
  • Set associative
    • Prefer non-valid entry, if there is one
    • Otherwise, choose among entries in the set
  • Least-recently used (LRU)
    • Choose the one unused for the longest time
  • Random
    • Gives approximately the same performance as LRU for high associativity

Write - allocate

  • 쓰기 연산 수행할 때 cache miss 발생시, cache로 먼저 가져온 다음, cache에서 데이터를 쓰는 방식

Write - no allocate

  • 쓰기 연산 수행할 때 cache miss 발생시, memory에 바로 쓰는 방식 -> memory access가 빈번하게 발생할 수 있음

Cache


Cache Coherence Problem

  • multi core 환경에서 동일한 memory 주소에 access할 시 발생함.

예시)

Cache

위와 같은 coherence 문제를 해결하기 위해, invalidate를 두어 coherence를 맞춰줌

Cache

Leave a comment