From Bitcoin Wiki
CryptoNight relies on random access to the slow memory and emphasizes latency dependence. Each new block depends on all the previous blocks (unlike, for example, scrypt). The algorithm requires about 2 Mb per instance:
- It fits in the L3 cache (per core) of modern processors.
- A megabyte of internal memory is almost unacceptable for the modern ASICs.
- GPUs may run hundreds of concurrent instances, but they are limited in other ways. GDDR5 memory is slower than the CPU L3 cache and remarkable for its bandwidth, not random access speed.
- Significant expansion of the scratchpad would require an increase in iterations, which in turn implies an overall time increase. "Heavy" calls in a trustless p2p network may lead to serious vulnerabilities, because nodes are obliged to check every new block's proof-of-work. If a node spends a considerable amount of time on each hash evaluation, it can be easily DDoSed by a flood of fake objects with arbitrary work data (nonce values).