Solaris 10 includes a new cryptographic framework, and ships OpenSSL under /usr/sfw.
x86 and x86-64 development did not stop during the period when Sun was not making Solaris/x86 available to customers.
fma: fault management architecture. smf: service management facility.
big focus on lowering TCO (total cost of ownership) in solaris 10.
they eat their own dogfood, and they start eating it early in the OS release lifecycle: sun engineers use solaris on their desktops.
lots of innovation in sol 10 (the os itself); Linus is quoted as saying most innovation in linux is above the os layer.

This section covers little-known features in versions of Solaris prior to Solaris 10.

observability:
truss: -m machine faults, -u function calls (like sotruss?), -T/-S/-M/-U stop the process, -c syscall stats
pmap -s -x
pfiles: shows the files, sockets, and doors of a process; replaces lsof, should not need lsof on sol 10
pstack: can be run on a core file; applies to each thread
pargs: print command arguments. security implications?
psig: show signal disposition
nohup -p: nohup for an already-running process
preap: forces a parent to reap child zombies
prstat: like top ("don't need top anymore"); pretty detailed, see the man page; -L for threads; -z/-Z look at zones
ptree
gcore: force a running process to dump core without killing it
pldd: ldd for running processes
pwdx: print the working dir of a running process
pcred: shows effective, real, and saved uid and gid
pwait: waits for a process to terminate

resource management:
processor sets: added in sol 8; bind processors to processes; free, dynamic, easy to change without a reboot; but stateless, which is not good, so script it
sol 9 resource pools: stateful processor sets, fair share scheduler, fixed-priority scheduler (FX)
once a processor is in a processor set, only bound processes will run on that cpu
don't use processor sets on 1-cpu machines
can interact poorly with heavy interrupt load, say from a gigabit or 10-gigabit NIC; interrupts can be disabled for a processor set
psradm, psrset
projects: a persistent namespace (/etc/project) that binds applications together so they can be managed collectively
poolbind, rctladm, rcapd
a "resource cap" for memory (e.g. hold an app to no more than 10 GB of RAM) is harder than cpu management; the ability will be added in a future sol 10 release

libumem: heap allocator, intended to work well for more kinds of workloads (memory usage patterns) than malloc(); like a unified theory in physics :)
kma: kernel memory allocator; its design has been duplicated in some linux and *bsd allocators
libumem may help efficiency a lot (in applications that use the heap heavily?); lecturer has seen a 3x improvement
LD_PRELOAD'able; includes a nice set of debugging features, accessed via the mdb debugger

real time:
priocntl
fully preemptive kernel
see CLOCK_HIGHRES in timer_create(3RT)

end of little-known features in previous releases of solaris.

Covering Solaris 10 information now:
containers (zones): like FreeBSD jails
dynamic resource pools
dtrace
predictive self healing: fma, smf
process rights management: finer-grained than root/non-root; /etc/user_attr
zettabyte filesystem (ZFS): not in the first release of sol 10, hopefully in the 2nd, not sure; hopefully in "solaris 10 update 2"

resource management:
hierarchy: project > task > process > thread
resource pools; the kernel requires at least one cpu for itself
srm: Solaris Resource Manager
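To make the project/task hierarchy concrete, a small sketch. The project name user.oracle and the share values are invented for illustration; the commands (id, ps, newtask, prctl) are the standard project tools, but treat the exact invocations as a sketch rather than a tested recipe. The share mechanics are covered in the sol 9 FSS notes that follow.

    # which project and task is my shell running in?
    id -p
    ps -o pid,project,taskid,args -p $$
    # run a command in a new task under a specific (hypothetical) project
    newtask -p user.oracle sleep 60 &
    # inspect that project's cpu-shares resource control
    prctl -n project.cpu-shares -i project user.oracle
    # shares can be assigned persistently in /etc/project, e.g.:
    # user.oracle:100::::project.cpu-shares=(privileged,50,none)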
sol 9: no need to buy SRM, can use FSS instead. not priorities, but shares; allocate by proportion, more or less: a project may get more than its share, but never less than the shares allocated. define the total number of shares and how many each project gets.
rctladm(1M), sol 9: see the man page. resource limits; ulimit -a.
/etc/system is no longer required for installing a commercial database: settings there will be used if present, but the native mechanism is resource controls, and the defaults are larger. sun wants us doing less and less with /etc/system.

sol 10: dynamic resource pools. poold: if a project is not getting the resources it needs, poold will shuffle resources around to satisfy it.

sol 10: containers / zones. lecturer not sure if they are the same thing; some say containers are more than zones, but how?
sol 10 always boots into the global zone.
zone creation takes 20 minutes on the lecturer's laptop.
zoneadm list -iv
zlogin to log into a specific zone.
no pid overlap between zones; only the global zone sees all pids on the system.
single instance of solaris in a multi-zone scenario.
non-global zones don't see /dev, for example.
cannot run dtrace in a non-global zone yet; they are working on this.
can configure which filesystems are visible; can inherit filesystems; can make filesystems read-only or read-write.
from the global zone you can cd into a local zone's hierarchies (root, dev).
df output is different in local zones; it does not show /dev info.
every zone can have a unique network identity; ip traffic is routed through the global zone's NIC, much like virtual interfaces.
apps can bind to INADDR_ANY but still only get connections for their own zone; ports are unique to each zone.
break-ins: an attacker may make a mess of a local zone but should not be able to get to the global zone (or other local zones?).
don't nfs-export from the global zone to a local zone: "there are issues with that".
/dev/*mem, /dev/dsk, /dev/rdsk, etc. are not visible in local zones.
zonecfg(1M): give a name, path, network, autoboot, pool (which resource pool); config data is stored in xml format.
packages and patches + zones: install in the global zone and it shows up in the local zones; can also install only in the global zone, or only in local zones; pkgadd and friends were changed to control this behavior.
inherit-pkg-dir: a directory in a local zone that inherits from the global zone; loopback-mounted and read-only by default.
zonecfg -z zone create; set zonepath=; set autoboot=false; ... (a rough walk-through is sketched below, after the dtrace notes).
/etc/zones holds the config data; it can be vi'd, but...
zoneadm list -cv (opts?)
zoneadm -z name halt
zoneadm -z name boot: boots in a couple of seconds.
zlogin zonename
www.solarisinternals.com: the pdf from this talk should be there, in its current version, by tonight.

dtrace -l: list all probes; a probe is a point of data collection; 42000+ probes.
every probe has a unique integer identifier plus a provider, module, function, and name.
docs.sun.com: download the dtrace doc.
mdb: ufs_read::dis (?) disassembles ufs_read.
dtrace -n ufs_read:entry: the disassembly changes when you set a probe; the code is changed to enable the probe, and when you disable the probe the original code is restored.
one of the primary goals was to keep dtrace safe enough to use on a production system: it should not be possible to cause a crash with dtrace; very thorough error checking.
can aggregate data to cut down on userspace postprocessing:
dtrace -n 'syscall:::entry { @sc[execname,probefunc] = count() }'
blank fields match all occurrences; this shows commands and their syscalls; probefunc is the name of the function, execname is the name of the program, @sc is an aggregation (an associative array).
"instant gratification" instrumentation; uses a language similar to c or awk.
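Tying the zone commands above into one sequence, a rough walk-through of creating a minimal zone. The zone name "web", the zonepath, and the network settings are invented for illustration, and the zonecfg properties are quoted from memory, so treat this as a sketch:

    # in the global zone, as root
    zonecfg -z web
      create
      set zonepath=/zones/web
      set autoboot=false
      add net
        set physical=bge0
        set address=192.168.1.50
      end
      verify
      commit
      exit
    zoneadm -z web install     # the step that takes a while
    zoneadm -z web boot        # boots in a couple of seconds
    zlogin -C web              # console login to finish initial configuration
    zoneadm list -cv           # confirm the zone's state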
lockstat has been around a long time; in sol 10 it is built on dtrace. plockstat, added in sol 10, does the same for user applications; lockstat looks at kernel locks.
dtrace intermediate format (DIF): a virtual risc architecture used for executing dtrace commands.
dtrace -l -P sched: list all probes from the sched (scheduler) provider. you can sometimes guess the meanings of dtrace providers without being an os-internals expert.
fbt: a big dtrace provider; a probe can be set for every function in the kernel.
the "pid" dtrace provider: dtrace -n 'pid111:::entry' (111 is a pid number); with no action specified, the matching probes just fire and are listed.
dtrace -n 'pid10161:::entry { @s[probefunc] = count() }' shows all function calls in pid 10161.
if you do not restrict what to instrument, dtrace will attempt to instrument every instruction, and will often fail to do so for lack of memory.
no recompilation is necessary to use dtrace.
the sdt provider: you can get more information out of programs you have the source to.
with interpreted languages (python, java, ruby, bash, perl...) you have to run dtrace against the interpreter, not the script (of course); they are working on Java support for dtrace.
the dtrace language is called "d"; it borrows from c and awk. many example d scripts are available. dtrace -s scriptname.d runs one (a small example is sketched after the fma notes below).
quantize(n) gives a log-base-2 histogram, useful for weeding out hordes of insignificantly small items so the big items stand out; count(), on the other hand, is linear.
/ cond / is a predicate, a form of "if".
no explicit loops in d, for safety: otherwise you could accidentally create an infinite loop and hang the system. there are implicit loops, though, somewhat like a database query language. most related technologies can crash systems if used "incorrectly".
dtrace can identify all the ioctl()s used by a given process, and can delve into what happens inside a system call.
tfork.d can essentially sotruss the kernel during fork().
docs.sun.com, blogs.sun.com
dtrace is not just for diagnosis, and not just for kernel engineers: service personnel, system administrators, and developers too.
zonename command. there is a zonename variable in dtrace, so even though you cannot currently run dtrace inside a local zone, you can dtrace only the things belonging to a particular zone from the global zone. should add zonename to PS1 if available :)
d has awk-like BEGIN and END clauses and supports c-like operators. printa prints an aggregation, the way printf prints ordinary values; d has a c-like printf too.

Predictive self healing: fma and smf. smf is AKA "green line", named after a train line in Boston.
fma: fault management architecture. the point of FMA is to associate every error with a corrective action, automated if possible, otherwise the admin is notified. it also names errors, for easy lookup on Sun's website. e.g. a cpu starts generating soft errors: fma takes the cpu offline rather than panicking.
fma pieces: detect errors; data capture; describe and name errors with an fmri; event protocol; diagnosis; dependency; action; history.
no big block of nonsense to sift through for diagnosis: go to www.sun.com, paste the error message id into a form, get information.
did the lecturer say that sol 10 can offline individual memory pages?
the fault diagnosis technology in sol 10 is called "eversholt": a fault-tree language, compiled, with a simulation environment. you're unlikely to need to know eversholt as an admin; it sounds like kernel engineers might be interested, though.
fma is part of the solaris kernel: error handler > fault manager > ...
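Going back to the d-language notes above, a small sketch of a script in that style, using BEGIN/END clauses, a /predicate/, an aggregation, quantize(), and printa(). The choice of probe (the read(2) syscall) and the variable names are mine, not from the talk:

    #!/usr/sbin/dtrace -s
    /* histogram of requested read(2) sizes, broken down by program */

    BEGIN { printf("tracing read(2)... hit ^C to stop\n"); }

    syscall::read:entry
    /execname != "dtrace"/                     /* predicate: skip dtrace itself */
    {
            @bytes[execname] = quantize(arg2); /* arg2 = byte count requested */
    }

    END { printa(@bytes); }

Run it with dtrace -s scriptname.d, or make it executable and let the #! line do the work.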
fmadm; fmdump (check the logs); fmstat. maybe add some of this to oacstats.
fmstat -a takes a few minutes.
fma components are specific to particular pieces of hardware, but many parts of fma are generic.
fmstat -a hangs in local zones: run it in the global zone.
to look up fma errors: http://sun.com/msg/
fmadm config

/etc, /etc/rc*.d, init, and inetd are "ad hoc"; smf replaces these.
dependencies across a network for SMF? no, not yet, but it's the next logical step.
smf and fma both use fmris: a "fault management resource identifier" is a name for a resource.
svcs gives the fmri, the time the service started, and whether it's running.
not everything from sun has migrated to smf yet; it sounds like they ran out of time before sol 10 was frozen.
right now smf is process-oriented, but later you'll be able to supply a verification method that could test for an active tcp port, etc. (actually, this would be trivial to add with some shell or python scripting...)
svcs -p fmri, where the fmri is e.g. network/smtp:sendmail (a few more example invocations are sketched at the end of these notes).
svcs -D network/physical: list all services that depend on the physical network.
svcs -l metainit (check on Solaris Volume Manager initialization).
blogs.sun.com has examples for writing manifests by hand; later there will be a framework for creating them.

process rights management: each process has four privilege sets; integrated with the security framework.
ppriv -l lists the privileges available to processes.
complementary to rbac (role-based access control).
"effective privilege set"; you can define what is inherited and what is not.
ppriv -v $$ gives lots of information about the privileges of a given process (your current shell is $$).

zfs: not in sol 10 fcs; hopefully in sol 10 update 2. zettabyte filesystem, not spelled consistently in the lecture: sometimes one t, sometimes two. a zfs external beta will be available very soon.
limits, integrity checks ... 128 bits: bits, bytes or blocks? 65 bits in 12 years; a zettabyte is 70 bits; zfs goes to 256 quadrillion zb, the quantum limit of earth-based storage.
striped across devices; keeps track of device response times and tries to go to the fastest disks.
self-healing data: dd over a device in a zfs mirror and zfs is fine; it gives back good data and tells you there was an error.
if zfs can mirror, what does that mean for SVM?
the DMU treats disks as interchangeable parts for disk-space-allocation purposes.
contrasting ufs+svm with zfs: ufs+svm takes many steps, zfs is easier (pre-release syntax as given in the talk):
zpool create "home" mirror(disk1,disk2)
zfs mount -c home/ann
zpool add "home" mirror(disk3,disk4)
zfs will not be supported for the root filesystem initially.
zfs supports (user) quotas; group quotas?
cannot layer ufs on top of zfs.
zfs has snapshot capability: root read-only like a cdrom, plus writeable.
a pool has a fixed size, but any filesystem in the pool can consume the entire pool, barring quotas.
copy-on-write for zfs snapshots; you don't have to break a mirror to accomplish this. extra blocks come only from the same pool?
zfs: no silent data corruption; uses checksums for data integrity.

changed threads model: optional in sol 8, default in sol 9.
Atlas project: small-systems tuning.
janus project: run linux binaries on same-cpu solaris.
niagara: multicore, multiple threads per core, low heat. vertical threading: multiple threads multiplexed on a pipeline.
05-4-10: 1.1 million downloads of Solaris 10 since january.
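Returning to the smf notes above, a few example invocations, using the sendmail instance mentioned there. The fmri on a given box may differ, so treat these as illustrative:

    svcs -l network/smtp:sendmail        # long listing: state, dependencies, log file
    svcs -p network/smtp:sendmail        # processes belonging to the service
    svcs -d network/smtp:sendmail        # what this service depends on
    svcs -D network/physical             # what depends on the physical network
    svcadm disable network/smtp:sendmail
    svcadm enable network/smtp:sendmail
    svcs -x                              # explain any services that are not running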