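Monday, September 7, 2009

The small circle

I have never been rich, so I hardly went abroad growing up. Thanks to work, though, I have been to both Japan and the United States, so I will use those two places, and what I experienced there first-hand, as the basis for comparison...

Japan is crowded, and in the metropolitan areas it is wall-to-wall people. Outside the big cities the population thins out, and it feels much like the ordinary Taiwanese countryside, with about the same building density. But there is one trait you notice immediately in Japan: discipline. On the sidewalks of Shinjuku, at a red-light crossing you could clear in a few small steps, a big crowd of pedestrians waits for the light, and nobody takes those two or three steps. In Taiwan, someone who did not cross would be taken for an idiot...

I was also once in a parking lot in a Japanese suburb. There were no traffic lights, yet cars stopped for pedestrians and let you go first, much like what I saw in the United States. Driving in the US, you have to be ready to stop at any moment. Besides always stopping at a stop sign, drivers who meet a pedestrian, or a car on an odd path, come to a complete stop and wait for you to go first. American states are big and not many intersections have traffic lights, so stop signs matter a lot. The amazing part is that at a four-way or T intersection, after stopping, everyone takes turns yielding in a consistent order by tacit agreement. This is my on-the-spot understanding; I am not sure whether traffic law actually requires taking turns like this, but it feels great and is very efficient. Disciplined efficiency is what counts as smart in the US. In Taiwan, if you do not occupy the road or block the intersection you are taken for an idiot, and the result is that everyone crawls along slowly... Is that smart? And then people blame the roads for being too narrow, or there being too many cars... The odd thing is, I do not think American roads are any wider! Single-lane roads are everywhere~

A densely populated, loose society with no habit of respecting others breeds disputes and a steady stream of man-made disasters, and society becomes unsettled. Taiwan is like a small circle that treats wrong habits as normal and lacks the ability to respect and refine its institutions. The result is that precious resources keep being spent on internal friction; everyone wastes time resolving everyday disputes; nobody has the habit of reading. One day the resources will concentrate in the hands of others who keep improving, and one puff from them will knock you over... Then you find that however loud the slogans, it is like flood victims crying for help in the water: they still get swept under...

Better to work on cultivating myself and think about how to jump out of this small circle...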

Tuesday, September 1, 2009

git command line reference ...

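I spent some time organizing the git command-line notes from my recent study into two public articles:
Although git is only a command-line tool, its distinctive features make it far handier than the commercial Perforce I used at my previous job, even though the latter has a complete GUI front end...

Is the modern age really more advanced than the old days?

As a kid I always had the impression that modern medicine was so advanced that any illness could be cured... The only boundary condition in my head back then was having the money :P

So in school I went all out pulling all-nighters with my classmates... and eventually found that medicine cannot do much for a burned-out liver... Or is it that the medical system gives it no room to work? One or two minutes of consultation... rookie doctors who have to act very professional... hospitals run like production lines where only bugs with a specification get fixed...

Aren't plenty of heart, liver, and kidney transplants turning out OK? So we should keep believing in modern medicine, right?! Is that really so? If you have a fever, better go to another hospital... and if your flu is too mild, you are just a statistic...

Fixing things after the fact always has blind spots, and preventing them is always cheaper... For me, changing my mindset, such as eating light && exercising more, should beat becoming a statistic or being treated as a bug on the production line~~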

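Friday, August 28, 2009

Is complicated good? Or is that actually the point?

OpenEmbedded's package management mechanism really is amazing... but it is also rather complicated...

The open source world is deep and wide. Everyone contributes their wisdom in their own field, but when you try to organize it all, it is python, perl, java, tex, m4, ... at every turn, which is often an obstacle to system integration. One small thing that will not run can cost most of a day hunting for the problem. But I suppose that is also how service providers make a living?!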

O_O

feeling bad ...

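There is a kind of pain that is like a knife wound...
The instant the cut is made, you know something is wrong yet feel nothing unusual,
and only after it festers do you learn how deep the cut went...

The cut has been made; I will follow heaven's plan and walk the rest of life's road...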

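Monday, July 13, 2009

Red Bull's success ...

Red Bull took another win at the German Grand Prix, locking out first and second place for the second race in a row. Adrian Newey's mastery has finally paid off, and Mark Webber, after toiling so long in F1's second-tier teams, has finally tasted outright victory. It makes one reflect that a company's success needs genius-level employees, a technical director with real ability, and a proactive, far-sighted boss; none of the three can be missing...

Anyone interested can look into the history of the Red Bull team to see how this second-tier team made its winding way to its current achievements...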

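Sunday, June 21, 2009

Two wheels really are more exciting than two legs...

I have always liked jogging; not even the punishing runs of military service dimmed my love of running... But after recently moving close to work, I wanted something to do after hours and to build some fitness along the way, so I stuffed the 26" folding bike in the trunk and took it everywhere. After a few outings I found these two wheels are no small thing... In a short time they get me to many places I could never reach on foot. Now this is getting interesting...

But why do I always get left far behind on the climbs? Keep at it... One day I will get back the fitness of my junior high days, when I raced scooters on a bicycle... Orz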




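Sunday, June 7, 2009

Red ocean... blue ocean... there really is a difference...

Today I had time to tidy up the 26" folding bike I bought n years ago: 21 speeds, off-road tires... I recall paying about 3800 for it... probably two years before cycling suddenly became all the rage...

I just looked online at bikes with similar specs and found... these days... a bicycle costs more than a notebook...

What does this tell us ???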

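Saturday, May 9, 2009

The scary stay-at-home granny...

The stay-at-home granny next door is at home all day, watching what time each neighbor leaves for and returns from work and what mail is in whose mailbox, and on weekends she bangs and hammers away at home to look busy... which is hard on the poor engineer who has to keep reading datasheets... Sigh~ However capable an engineer is, he still has to come back to reality and face a helpless society...

So there was a reason Mencius's mother moved three times... and Jackie Chan calling Taiwan chaotic is not really an exaggeration...

Looks like a moving plan has to go into the queue too... seriously considering it~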

shit! >_<

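Wednesday, May 6, 2009

Maybe it is time for some repentance...

Lately I have made a string of wrong decisions in a row. For a Scorpio whose intuition is usually precise, it feels awful...

Perhaps it comes from being stuck for so long among people who habitually make wrong decisions and then slowly patch them up. I try my best to keep my distance, but over time it still rubs off quite a bit...

Lie low, then reflect and reflect again... that is the only ability this Scorpio has left~~ When will I find the angle for a fresh start??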

Tuesday, May 5, 2009

What is easy?

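Into the waves, out of the waves, fighting in the waves, master against master... exciting, but too hard, low performance/cost ratio~
Picking gold out of the garbage heap, finding a lotus in the mud... boring, but easy, high efficiency~

ps. You still have to train your kung fu to become a master...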

Saturday, April 25, 2009

Redirect to a new site ...

We've found a fantastic place to host our personal works. Special thanks to Yihsun for the clever discovery.

This will link to the new site.

Monday, April 20, 2009

Backyard garden ...

This is a top-grade place in Taiwan to take pictures. Since the government finished building several new highways, getting there is like stepping into a backyard garden.

This picture is flawed because of a) the high-ISO setting and b) no stars. It will be replaced when the clouds are finally gone and a galaxy of stars appears in my camera's frame.

This place has more of my spare-time works, for those who are interested.

Saturday, April 4, 2009

An off-time secret mission ...

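Last week, after cleaning up the data-processing code, I added a data repository snapshot feature; the program's load-in became noticeably faster, and the code is also easier to extend with new features. Today I cleaned up the plotting part as well. My nose was acting up and I felt dizzy all day, but by evening I could finally see the monthly, quarterly, half-year, and yearly lines, so progress continues ... noting it down here (mosaicked as usual)~


I hope to bring this project to a stopping point soon~ The other queued-up tasks are about to grow mold.

I subscribed to several more HBO channels this month, and I finally have time to flip through them ...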

Update: This project has been hosted here. Thanks to the help from Yihsun.

Wednesday, March 18, 2009

Beagleboard Specification Overview (I)

After spending some time collecting documents for all the on-board devices, what stands out is the exciting multimedia capability of the MPU and, especially, the IVA2.2 subsystem. IVA stands for Image, Video, and Audio. It is especially impressive that the SGX module has a universal shader engine that can serve not only MS shader model 3 (VS and PS; GS is not mentioned in the spec, so that needs further investigation) and the OpenGL 2.0 shading language (GLSL), but also, supposedly, the programmable video codec engine. The universal shader engine was the trend in GPU design two years ago, when I was still a Direct3D driver engineer.

The following list highlights the important features of the MPU and IVA2.2 subsystems ...

1. MPU subsystem modules

- ARM subchip

Cortex-A8 core (r1p1)

ARMv7 ISA (Thumb-2, Jazelle RCT)

NEON SIMD coprocessor (VFP lite, media streaming)

16K I/D L1 cache (4 way set associative, 64 bytes cache line), 256K L2

- INTC (96 synchronous int lines)

- Asynchronous interface with core logic (this probably refers to the AXI interface)

- ICE-Crusher, ETM, ETB modules

2. IVA subsystem features (image/video/audio)

- IVA2.2, based on TMS320DMC64x+ VLIW DSP

- 32-bit fixed point

- VLIW

* 6 ALU

* 8 instructions per cycle, 8 execution units

8x8 MAC (multiply accumulate)

6x6 MAC

((a+b+c) >> 1) interpolation

two 32x32 multiply per cycle

- dynamically mixed 16/32 bit instruction sets

- 32K P(direct-mapped)/D(2-way-set 4bytes cache line) L1 cache, 48K SRAM

64K(2-way-set 128bytes cache line) L2, 32K SRAM, 16K ROM

- private DMAC

128 channel, 1D/2D addressing, 64-bit read x2 and write x2 ports, 32/64 bytes burst

- L1 INTC

- 32 entry MMU

3. PowerVR SGX

- hw 2D operations: vector graphics, BLTs, ROPs

- universal shader engine (MS VS/PS3.0, OGL 2.0)

- geometry dma

- virtualized memory addressing

- programmable video codec support: H264, H263, MPEG4, WMV9, JPEG

- 2048 x 2048 x 24bpp resolution

- 256 24-bit palette entries

- picture-in-picture(overlay), color-space conversion, rotation

- remote frame buffer support

- LCD pixel/bus interfaces (MIPI DPI/DBI 1.0)

- NTSC/PAL CVBS & S-video analog output

4. Four DMA controllers in the system

- sDMA (system DMA)

* one read port and one write port

* 32 channels (prioritized)

* 96 hw-reqs

* 256 x 32 FIFO

- EDMA (enhanced DMA)

for IVA2.2 subsystem

- Display DMA

- USB HS DMA

As a novice user of a new development board, the first few things to clarify are the system power and clock distribution and the memory space mapping. Let's set the former aside for now, since it is usually just tedious spec reading, and start with the relatively "interesting" topic: the mapping of the memory space.

According to the TI application processor technical reference manual (spruf98b), the 4G system memory space is partitioned at two levels of granularity:

- Level 1 - 4 quarters (Q0, Q1, Q2, Q3), each corresponding to a 1G address space

Q0 (0x00000000 ~ 0x3fffffff)

boot space (1M)

GPMC (1G)

Q1 (0x40000000 ~ 0x7fffffff)

on-chip memory (128M)

boot-rom, SRAM

L4 interconnects (128M)

L4-core (16M), L4-wakeup (256K), L4-peripherals (1M)

L4-emu (64M)

L3 interconnect (128M)

L3-control regs, SMS regs, SDRC regs, GPMC regs (16M each)

SGX (64M)

IVA2.2 (64M)

SDRC-SMS virtual address space (256M)

Q2 (0x80000000 ~ 0xbfffffff)

SDRC cs0 (512M)

SDRC cs1 (512M)

Q3 (0xc0000000 ~ 0xffffffff)

SDRC-SMS virtual address space (512M)

- Level 2 - 8 blocks per quarter, 128M each, mapped to target spaces, where the target spaces include:

* Boot Space

1M in GPMC(Q0) or (according to sys_boot5 pin configuration) ...

On-chip ROM(0x40000000~0x400fffff) space

* GPMC Space

1G Q0

8 chip selects

gpmc_ncs0 to gpmc_ncs7

16M ~ 128M each block

programmable base and size

* SDRC space

1G Q2

2 chip selects

sdrc_ncs0 and sdrc_ncs1

64M ~ 512M each block

fixed base (0x80000000), programmable size for sdrc_ncs0

programmable base (default 0xA0000000) and size for sdrc_ncs1

* VRFB / SDRC-SMS (access to SDRC space through the rotation engine)

256M Q1

512M Q3
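
The two-level partition above can be sketched as a small address lookup. This is only an illustration in Python, not anything taken from the TI manual: the names QUARTER_SIZE, BLOCK_SIZE, REGIONS, and locate are hypothetical, and the block size assumes each 1G quarter is split into 8 equal blocks (1G / 8 = 128M).

```python
# Sizes assumed from the notes: four 1G quarters, each split into 8 blocks.
QUARTER_SIZE = 1 << 30           # 1G per level-1 quarter
BLOCK_SIZE = QUARTER_SIZE // 8   # 128M per level-2 block (assumption: equal split)

# Level-1 quarters as listed in the notes (start, inclusive end, label).
REGIONS = [
    (0x00000000, 0x3FFFFFFF, "Q0: boot space / GPMC"),
    (0x40000000, 0x7FFFFFFF, "Q1: on-chip memory, L4/L3, SGX, IVA2.2, SDRC-SMS"),
    (0x80000000, 0xBFFFFFFF, "Q2: SDRC cs0/cs1"),
    (0xC0000000, 0xFFFFFFFF, "Q3: SDRC-SMS virtual address space"),
]


def locate(addr):
    """Return (quarter index, block index inside the quarter, label)."""
    if not 0 <= addr <= 0xFFFFFFFF:
        raise ValueError("address outside the 4G space")
    quarter = addr // QUARTER_SIZE
    block = (addr % QUARTER_SIZE) // BLOCK_SIZE
    return quarter, block, REGIONS[quarter][2]


if __name__ == "__main__":
    # 0xA0000000 is the default sdrc_ncs1 base mentioned above.
    for a in (0x40000000, 0x80000000, 0xA0000000):
        q, b, label = locate(a)
        print(f"0x{a:08X} -> Q{q}, block {b} ({label})")
```

With these assumptions the default sdrc_ncs1 base 0xA0000000 falls in Q2, block 4, that is, 512M above the fixed sdrc_ncs0 base.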

So now we have an initial view of the system bus setup and the placement of
subsystems and devices in the memory space. Do these listed features excite
you too? Let's examine them one by one in the upcoming study plans.

- ARM Cortex-A8 Technical Reference Manual

- OMAP3530 Application Processor Technical References

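Tuesday, March 17, 2009

It has been a long time since I listened closely to a beautiful melody...

A song with a bit of a country flavor that ICRT has been playing heavily lately; let me try embedding it ...



Ah~ hearing a beautiful melody means several more items in the secret-mission queue...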

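Tuesday, March 10, 2009

Sneaking in a Beagleboard unboxing ...

Beagleboard community: http://beagleboard.org/

I quietly set aside my main after-work task and went to round up the accessories, so on the second day after delivery I could finally test whether the board works, by running the community's diagnostic process. Here is a summary of the test...

Open the big box, dig out a PCB almost as small as a PDA, and here is a group photo with some accessories connected:


The other accessories I gathered are ...

  1. HDMI to DVI-D converter; the board does not support HDMI, it just uses an HDMI connector to hook up an LCD monitor with a DVI-D interface.
  2. USB A (female) to A (female); the connector on the board is mini-B, and I could not find a mini-B to mini-B cable anywhere in the store, so I had to use a mini-B to A (male) cable and then this adapter to connect the HUB. That way I can plug in the USB ethernet adapter and the USB keyboard/mouse;
  3. Externally powered USB HUB. When the OTG port is in host mode it can supply at most 100 mA, so if a device needs more power than that, it has to come from a HUB with an external power supply.
  4. USB ethernet adapter; a spare for now... because I do not know whether linux has a driver for it...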





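Now the key part: with the null modem connected, the terminal open, and 5V power plugged in, the terminal screen on first boot:


The bootcode shown on the 22" LCD ...


After preparing the uboot linux image and ramdisk on the SD card per the manual, it booted into the familiar linux command prompt, and the LCD showed the familiar penguin from the framebuffer driver...


Running the default video testing sample; wow~ so smooth... (camera ISO pushed to 1600, and a few shots still came out shaky)






Running some audio, using aplay to play a wav file...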

[root@beagleboard mmc]# aplay -t wav -c 2 -r 44100 -f S16_LE -v victory-orchestral.wav
Playing WAVE 'victory-orchestral.wav' : Signed 16 bit Little Endian, Rate 44100 Hz, Stereo
Plug PCM: Hardware PCM card 0 'TWL4030' device 0 subdevice 0
Its setup is:
stream : PLAYBACK
access : RW_INTERLEAVED
format : S16_LE
subformat : STD
channels : 2
rate : 44100
exact rate : 44100 (44100/1)
msbits : 16
buffer_size : 32768
period_size : 2048
period_time : 46439
tick_time : 7812
tstamp_mode : NONE
period_step : 1
sleep_min : 0
avail_min : 2048
xfer_align : 2048
start_threshold : 32768
stop_threshold : 32768
silence_threshold: 0
silence_size : 0
boundary : 1073741824


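Hmm... an ALSA PCM dma buffer size of 32K and a period size of 2K, both far smaller than the settings on our own demo board (256K/64K), and it still plays this smoothly... Haha~ This driver does have a flaw, though: the pop on audio reset is really loud. Our driver beats hers on that one... ccc~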
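
As a quick sanity check on those numbers, the period and buffer durations can be worked out from the dump. This is only a sketch in Python; it assumes the sizes printed by aplay are in frames (one frame = one sample per channel), which is consistent with the reported period_time of 46439 us. The helper names frames_to_usec and frames_to_bytes are hypothetical.

```python
def frames_to_usec(frames, rate_hz):
    """Duration of `frames` PCM frames at `rate_hz`, in whole microseconds."""
    return frames * 1_000_000 // rate_hz


def frames_to_bytes(frames, channels, bytes_per_sample):
    """Memory taken by `frames` interleaved PCM frames."""
    return frames * channels * bytes_per_sample


# Values copied from the aplay -v dump above.
RATE = 44100
PERIOD_SIZE = 2048
BUFFER_SIZE = 32768

period_us = frames_to_usec(PERIOD_SIZE, RATE)    # matches period_time in the dump
buffer_us = frames_to_usec(BUFFER_SIZE, RATE)    # roughly 0.74 s of audio
buffer_bytes = frames_to_bytes(BUFFER_SIZE, channels=2, bytes_per_sample=2)  # S16_LE stereo

if __name__ == "__main__":
    print(period_us, buffer_us, buffer_bytes)
```

So one period interrupt arrives about every 46 ms, and the whole ring holds about 0.74 s of audio. If the sizes are indeed frames, the buffer occupies 128K bytes for 16-bit stereo.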

The flaws do not outshine the gem~

Liver's burned out~ let's call it a day! -_-
#over

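Saturday, February 21, 2009

What is practical?

People often say Americans are very practical... But when you go to their so-called homeland, you find they have lots of equipment and lots of machines, and the point is that they all work, and on the whole they run by rules... The rules may be complicated, but the key points are always boiled down for you; follow the key points and you will not get in anyone's way... When someone is found not following the rules, they simply "tell" her, and of course there are many ways of "telling"... Basically they do not waste time on appearances, so the way of "telling" is bound to be efficient!

Once we have built a certain foundation, we often fear having to start again from zero... In real life the chances of that are actually pretty high, especially when the so-called foundation rests purely on luck, fluke, or injustice. What matters is whether you have the ability to start from zero, and how efficiently you work through it when that kind of tempering lands on you...

What earlier generations left behind is a foundation that our forebears built up from zero through hard struggle. What the younger generation should remember is not how much that foundation is worth but the experience and skill of starting from zero, and then keep accumulating new assets efficiently on top of it...

Holding meetings for show is like going in circles inside the assets our predecessors left: with no energy input, the assets keep eroding from within until they vanish...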

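Friday, February 20, 2009

On running

Once you reach a certain age, you truly understand how physical fitness limits what a person can achieve...

As the book 基因密碼 (The Genetic Code) puts it, from the moment you are born your genes have scheduled which disease you get and when, released right on time. But what heaven has arranged is not entirely irreversible; otherwise there would be no such term as natural selection...

Little Ma (小馬) has a point in demanding fitness from the armed forces... However heavy the workload, however many documents are still unprocessed, that is a debt to the country; not keeping fit is a debt to yourself, and that one you cannot afford to take lightly!

PS. I recently noticed my cornering has slipped a lot too... Hmm~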

Friday, January 30, 2009

IA32 System Programming - Part VII

Task Switch

A processor supported task context (or state) is defined as a TSS structure, which includes the following fields:

Dynamic Fields
  • Segment selector registers (CS, SS, DS, ES, FS, and GS).
  • General purpose registers (EAX, ECX, EDX, EBX, ESP, EBP, ESI, and EDI).
  • The processor status register (EFLAGS).
  • The program counter register (EIP).
  • Link to the previous task.
Static Fields
  • Task LDT (local descriptor table) segment selector.
  • Task page directory base register (CR3/PDBR)
  • Stack pointers for privilege level 0~2.
Static TSS fields are usually initialized by system software during task creation time.

There are 4 cases the processor will transfer execution to another task:
  • A far call or jump directly to a TSS descriptor in the GDT.
  • A far call or jump indirectly to a task-gate descriptor in the GDT or the current LDT.
  • An asserted interrupt or exception vector points to a task-gate descriptor in the IDT.
  • An "iret" when the EFLAGS::NT flag is set.
Analogous to the call-gate descriptor for indirect access to privileged procedures, the task-gate descriptor provides protected indirect reference to tasks. In a direct TSS call or jump, the CPL and RPL are checked against the DPL of the target TSS descriptor. In an indirect task switch, they are checked against the DPL of the task-gate descriptor. Processor state is saved into the outgoing task's TSS and restored from the incoming task's TSS.
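
To make the TSS fields above concrete, here is a minimal sketch of the 32-bit TSS layout as a Python ctypes structure. The class name TSS32 and the field names are my own. Each selector is modelled as a 32-bit slot whose upper 16 bits are reserved, and the last two 16-bit fields (the debug trap flag and the I/O map base) are part of the hardware format even though the list above does not spell them out.

```python
import ctypes

U32 = ctypes.c_uint32
U16 = ctypes.c_uint16


class TSS32(ctypes.LittleEndianStructure):
    """32-bit task state segment, 104 bytes (field names are illustrative)."""
    _fields_ = [
        ("prev_task_link", U32),             # dynamic: link to previous task
        ("esp0", U32), ("ss0", U32),         # static: stack for privilege level 0
        ("esp1", U32), ("ss1", U32),         # static: stack for privilege level 1
        ("esp2", U32), ("ss2", U32),         # static: stack for privilege level 2
        ("cr3", U32),                        # static: page directory base (PDBR)
        ("eip", U32), ("eflags", U32),       # dynamic: program counter and status
        ("eax", U32), ("ecx", U32), ("edx", U32), ("ebx", U32),
        ("esp", U32), ("ebp", U32), ("esi", U32), ("edi", U32),
        ("es", U32), ("cs", U32), ("ss", U32),
        ("ds", U32), ("fs", U32), ("gs", U32),
        ("ldt_selector", U32),               # static: task LDT segment selector
        ("trap", U16),                       # bit 0 is the debug trap (T) flag
        ("io_map_base", U16),                # offset of the I/O permission bitmap
    ]


if __name__ == "__main__":
    print(ctypes.sizeof(TSS32), TSS32.cr3.offset, TSS32.eip.offset)
```

System software would fill the static fields once at task creation; the processor writes the dynamic ones on every task switch.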


IA32 System Programming - Part VI

Fast System Calls

The IA32 architecture provides fast system calls as a low-overhead mechanism for entering and leaving system software.

  • The “sysenter” instruction is for use by user code running at privilege level 3 to access system procedures running at privilege level 0.
  • The “sysexit” instruction is for use by system procedures running at privilege level 0 for fast returns to user code running at privilege level 3.

The target procedure entry point and stack pointer are predefined in MSRs (model specific registers) at fixed addresses. MSRs are accessed with the privileged “rdmsr” and “wrmsr” instructions. The complex privilege checks are reduced to a minimum, and memory accesses to the descriptor tables are eliminated.
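
The register loading done by the two instructions can be modelled in a few lines. This is a toy Python model, not system code: the MSR addresses 0x174 to 0x176 are the architectural IA32_SYSENTER_CS/ESP/EIP addresses, while the functions sysenter and sysexit, the dictionary-based MSR file, and the EXAMPLE_MSR values are hypothetical.

```python
# Architectural MSR addresses used by sysenter/sysexit.
IA32_SYSENTER_CS = 0x174
IA32_SYSENTER_ESP = 0x175
IA32_SYSENTER_EIP = 0x176


def sysenter(msr):
    """Register state loaded by sysenter; it always lands at privilege level 0."""
    cs = msr[IA32_SYSENTER_CS] & 0xFFFC      # RPL bits forced to 0
    if cs == 0:
        raise ValueError("#GP(0): IA32_SYSENTER_CS holds a null selector")
    return {
        "cs": cs,
        "ss": cs + 8,                        # stack segment is the next GDT entry
        "eip": msr[IA32_SYSENTER_EIP],
        "esp": msr[IA32_SYSENTER_ESP],
        "cpl": 0,
    }


def sysexit(msr, ecx, edx):
    """Register state loaded by sysexit; it always returns to privilege level 3."""
    base = msr[IA32_SYSENTER_CS] & 0xFFFC
    if base == 0:
        raise ValueError("#GP(0): IA32_SYSENTER_CS holds a null selector")
    return {
        "cs": (base + 16) | 3,               # user code segment, RPL 3
        "ss": (base + 24) | 3,               # user stack segment, RPL 3
        "eip": edx,                          # return address passed in EDX
        "esp": ecx,                          # user stack pointer passed in ECX
        "cpl": 3,
    }


# Hypothetical MSR contents, for illustration only.
EXAMPLE_MSR = {
    IA32_SYSENTER_CS: 0x10,
    IA32_SYSENTER_ESP: 0xC0001000,
    IA32_SYSENTER_EIP: 0xC0100000,
}
```

Note how every selector is derived from one MSR by fixed offsets, which is why the kernel and user code/stack descriptors must sit in four consecutive GDT entries and why no descriptor-table privilege checks are needed on the way in or out.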



IA32 System Programming - Part V

Inter-Privilege-Level Call

A transfer of program control to a privileged code segment through a call-gate descriptor, which in turn holds the location of the target code segment, is called an inter-privilege-level call. The call-gate descriptor is specified in the far form of the call/jmp instruction. The processor performs several privilege checks before loading new values into the CS and EIP registers. The general rules involve the following fields:

  • CPL of current code segment.
  • RPL of the requestor (segment selector of the far form call/jmp instruction)
  • DPL of the call gate descriptor.
  • DPL of target code segment descriptor.

The CPL, the RPL, and the DPL of the target code segment are checked to decide the privilege-level switch. In addition, the DPL of the call-gate descriptor acts as a guard that controls who may reach the target code segment, according to the requestor’s privilege level. For instance, system software components that are designed to be used by both the system software itself and application programs (e.g., device I/O interfaces) could be reached through call gates that allow access from all privilege levels (gate DPL 3, so CPL 0~3 may use it). Services that are designed to be used by system software internally (e.g., device initialization procedures) should only be reachable through more privileged call gates (DPL 0 or 1).

A stack switch occurs automatically when the call lands in a more privileged non-conforming code segment (target DPL numerically lower than the CPL). The CPL changes to the destination DPL accordingly. Each task should define, in its TSS structure, a stack pointer for every privilege level it uses. The processor switches back to the caller's stack automatically on the far return instruction.
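
The call-gate rules above can be written down as a small predicate. This is a sketch in Python under the usual convention that a numerically lower level is more privileged; the function name check_call_gate and its return tuple are hypothetical, and real hardware performs further checks (present bits, descriptor types, limits) that are left out here.

```python
def check_call_gate(cpl, rpl, gate_dpl, target_dpl, conforming, is_call=True):
    """Return (allowed, new_cpl, stack_switch) for a far call/jmp through a call gate."""
    # 1. The gate itself must be reachable: max(CPL, RPL) <= gate DPL.
    if max(cpl, rpl) > gate_dpl:
        return (False, cpl, False)
    # 2. The target must not be less privileged than the caller.
    if is_call or conforming:
        if target_dpl > cpl:
            return (False, cpl, False)
    elif target_dpl != cpl:
        # A jmp can only reach a non-conforming segment at the same level.
        return (False, cpl, False)
    # 3. Only a call into a more privileged non-conforming segment raises
    #    the privilege and switches to the stack taken from the TSS.
    if is_call and not conforming and target_dpl < cpl:
        return (True, target_dpl, True)
    return (True, cpl, False)


if __name__ == "__main__":
    # User code (CPL 3) calling a ring-0 service through a DPL-3 gate.
    print(check_call_gate(cpl=3, rpl=3, gate_dpl=3, target_dpl=0, conforming=False))
    # The same call through a DPL-0 gate is rejected.
    print(check_call_gate(cpl=3, rpl=3, gate_dpl=0, target_dpl=0, conforming=False))
```

The first case models a device I/O interface exposed to applications; the second models an internal service guarded by a more privileged gate.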



IA32 System Programming - Part IV

Inter-Segment Call

To transfer program control directly to another code segment without a privilege-level change, the target procedure entry point is specified in the far form of a call/jmp instruction. The processor performs several privilege checks before loading new values into the CS and EIP registers. The privilege-level fields involved are:

  • CPL (the privilege level of current code segment which contains the source call or jmp instruction)
  • DPL (the privilege level of the target code segment descriptor which contains the target procedure)
  • RPL (the requestor’s segment selector privilege level in the call or jmp instruction)

System software that needs to be protected from user-privilege code is placed in non-conforming code segments. Execution cannot be transferred to a non-conforming code segment at a different privilege level through a direct call or jump; otherwise the processor raises a general-protection exception (#GP).

Some types of exception handlers (e.g., divide-error or overflow) and system software components that do not have to access protected facilities (e.g., math libraries) can be placed in conforming code segments. They can be called from less privileged code and run at the caller's CPL, which prevents them from accessing more privileged data. In this way, the overhead of a privilege-level change is avoided.

There is no CPL change in either conforming or non-conforming form of direct call or jump to a target code segment. Since the CPL does not change, no stack switch occurs.
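
The checks for a direct far call or jump can be summarised as a one-screen predicate. This is a sketch in Python; the function name check_direct_far_transfer is hypothetical, lower numbers mean more privilege, and only the privilege checks are modelled.

```python
def check_direct_far_transfer(cpl, rpl, target_dpl, conforming):
    """True if a direct far call/jmp to the target code segment is allowed.

    The CPL never changes on a direct transfer, so no stack switch occurs.
    """
    if conforming:
        # Conforming segments may be entered from the same or a less
        # privileged level; the RPL is not checked.
        return target_dpl <= cpl
    # Non-conforming segments require an exact privilege match, and the
    # selector's RPL must not be weaker than the CPL.
    return target_dpl == cpl and rpl <= cpl


if __name__ == "__main__":
    # User code reaching a ring-0 math library in a conforming segment: allowed.
    print(check_direct_far_transfer(cpl=3, rpl=3, target_dpl=0, conforming=True))
    # User code jumping straight into ring-0 non-conforming code: #GP.
    print(check_direct_far_transfer(cpl=3, rpl=3, target_dpl=0, conforming=False))
```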



IA32 System Programming - Part III

Program Control Transfer Overview

When the CR0::PE bit is set, the processor switches to protected mode, in which segmentation is always active; there is no mode bit to disable segmentation while the processor is in protected mode. Paging, by contrast, is optional and is enabled by setting the CR0::PG bit.

In protected mode, the processor always executes within a task context, so at least one task is defined in the system. In addition, apart from the scheduling policy, which system software implements explicitly, task dispatching, execution, and suspension are supported by the processor's task management facility. A task context is defined by a structure called the TSS (task state segment), which contains code execution space information (a code segment, one or more data segments, and a stack segment) and task state information (processor status, general purpose registers, program counter, page directory base, local descriptor base, I/O map base, and a link to the previous task). Program control transfer between tasks is supported by the processor either through a direct task switch or indirectly through a task gate. The task switch mechanism is described in Part VII.

Program control transfer without an explicit task switch takes the form of an inter-segment call, an inter-privilege-level call, a fast system call, or an interrupt/exception. The first two are described in Part IV and Part V, and fast system calls in Part VI. Interrupts and exceptions are left as a future topic for now. The formal documentation for these sections can be found in the IA32 Intel Architecture Software Developer’s Manual – Volume 3: System Programming Guide.


Before digging further into these topics, readers can refer to the IA32 Intel Architecture Software Developer’s Manual – Volume 2: Instruction Set Reference, section 3.2, for the format and usage of the “call”, “jmp”, and “ret” instructions. A near call or jump transfers control to a procedure within the current code segment. That is not what we are interested in here, so the focus is on the far forms of these instructions.


IA32 System Programming - Part II

Segmentation and Paging Overview

IA32 protected-mode memory management has two parts: segmentation and paging. Segmentation provides hardware-supported partitioning of the linear address space for isolating and placing code, data, and stack sections. Paging provides on-demand virtual-to-physical memory mapping, which can be used to isolate and protect memory between multiple tasks. A minimal form of segmentation is required in IA32 protected mode, so there is no way to disable it. Paging, however, is optional for system software.

Segmentation starts with a 16-bit segment selector and a 32-bit offset that together locate a particular byte in the processor’s linear address space. The “selector:offset” pair is called a logical address (also called a far pointer). The selector identifies a segment descriptor in a descriptor table. The TI field in the selector specifies whether to look in the global descriptor table (GDT) or in the local descriptor table (LDT). The GDT and LDT base addresses are held in the GDTR and LDTR registers respectively. There are other types of descriptor tables, but they are not involved in logical-to-linear address translation, so they are out of scope for now. The RPL field specifies the requestor’s privilege level and takes part in the complex privilege-level checking facility described in the next few parts of this IA32 system programming series. Each entry of a descriptor table is fixed at 8 bytes. The base and limit fields of a descriptor specify the base and extent of the segment in the processor’s linear address space. The flags field of a descriptor specifies the descriptor type. When the S descriptor flag is set, the descriptor is either a code or a data segment descriptor. When the S flag is clear, the descriptor is a system descriptor, which covers system segment descriptors (LDT segment descriptor, TSS descriptor) and gate descriptors (call-gate, task-gate, interrupt-gate, and trap-gate descriptors). A gate descriptor is a kind of “gate” that points indirectly to a code entry point in a code segment or to a TSS.
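
The selector and descriptor fields described above can be decoded with a few shifts and masks. This is a sketch in Python; the function names decode_selector, decode_descriptor, and to_linear are hypothetical, and the descriptor is passed as the 64-bit little-endian integer value of its 8 bytes. FLAT_CODE is the well-known flat 4G ring-0 code segment descriptor, used here only as an example.

```python
def decode_selector(sel):
    """Split a 16-bit segment selector into its index, TI, and RPL fields."""
    return {"index": sel >> 3, "ti": (sel >> 2) & 1, "rpl": sel & 3}


def decode_descriptor(raw):
    """Decode an 8-byte segment descriptor given as a 64-bit integer."""
    limit = (raw & 0xFFFF) | (((raw >> 48) & 0xF) << 16)
    base = ((raw >> 16) & 0xFFFFFF) | (((raw >> 56) & 0xFF) << 24)
    return {
        "base": base,
        "limit": limit,                 # in bytes, or in 4K units when g == 1
        "type": (raw >> 40) & 0xF,
        "s": (raw >> 44) & 1,           # 1 = code/data, 0 = system descriptor
        "dpl": (raw >> 45) & 3,
        "present": (raw >> 47) & 1,
        "g": (raw >> 55) & 1,           # granularity flag
    }


def to_linear(descriptor, offset):
    """Logical-to-linear translation: segment base plus offset (no limit check)."""
    return (descriptor["base"] + offset) & 0xFFFFFFFF


FLAT_CODE = 0x00CF9A000000FFFF   # base 0, limit 0xFFFFF, G=1, code segment, DPL 0

if __name__ == "__main__":
    print(decode_selector(0x1B))          # GDT entry 3, RPL 3
    print(decode_descriptor(FLAT_CODE))
```

With a flat descriptor like this one the linear address simply equals the offset, which is the "minimal form of segmentation" most operating systems settle for.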

If paging is not enabled, the linear address space maps directly onto the physical address space of the processor. If it is enabled, the mapping goes indirectly through levels of page tables, and the linear address space is divided into fixed-size pages. System software configures the page size through the processor’s control registers; it can be 4K, 2M, or 4M bytes. If the page containing a linear address is not currently backed by a physical page, a page fault exception is raised. The corresponding exception handler in system software typically catches the exception, then allocates a physical frame and/or copies data in from disk for that linear address. The first level of page table is called the page directory, whose base address is set by system software in the CR3 system register. To minimize the bus accesses needed for address translation, the most recently translated entries are cached in the translation lookaside buffers (TLBs). When the CR3 register is reloaded, the processor flushes the previously cached (non-global) entries, so system software does not have to worry about stale TLB entries after switching page directories.
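
For the 4K page case, the two-level walk can be sketched as follows. This is a toy Python model: the names split_linear_4k, translate, PageFault, and PAGE_DIRECTORY are hypothetical, the tables are plain dictionaries rather than real 32-bit entries, and a missing dictionary key stands in for a cleared present bit.

```python
class PageFault(Exception):
    """Raised when a linear address has no present mapping."""


def split_linear_4k(linear):
    """Split a 32-bit linear address into (directory index, table index, offset)."""
    return (linear >> 22) & 0x3FF, (linear >> 12) & 0x3FF, linear & 0xFFF


def translate(linear, page_directory):
    """Walk a toy two-level page table and return the physical address.

    page_directory maps a directory index to a page table; each page table
    maps a table index to a 4K-aligned physical frame base.
    """
    dir_index, table_index, offset = split_linear_4k(linear)
    table = page_directory.get(dir_index)
    if table is None or table_index not in table:
        raise PageFault(hex(linear))
    return table[table_index] | offset


# Hypothetical mapping: one 4K page at linear 0xC0101000 backed by frame 0x00200000.
PAGE_DIRECTORY = {0x300: {0x101: 0x00200000}}

if __name__ == "__main__":
    print(hex(translate(0xC0101234, PAGE_DIRECTORY)))
```

A page fault handler in this model would catch PageFault, install a frame in the right table, and retry the access.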

The whole story above is abbreviated in the following figure.


Thursday, January 29, 2009

IA32 System Programming - Part I


System Architecture Overview

Besides the well-known general purpose registers and segment registers, IA32 x86 incorporates a set of system registers, data structures, and instructions in its system-level architecture. Only privileged code is allowed to access these system-level resources. This architecture includes:
  • The EFLAGS register, which controls I/O, maskable interrupts, debugging, task switching, and virtual-8086 mode.
  • Control registers (CR0~CR4), which control the operating mode of the processor and memory paging.
  • Debug registers (DR0~DR7) and instructions, which provide facilities for debugging system code and for performance monitoring.
  • Descriptor registers (GDTR, LDTR, IDTR, TR), descriptor tables, and related load/store instructions, which control segmented memory management.