Note: these documents may be out of date. Do not use as reference! |
To see what is currently happening visit http://www.perl6.org/
Welcome to the first summary from my new home in Gateshead. The same old wibble, with a different view from its window and fewer trips to London. Right, time to see what's been going on in perl6-internals this week.
The cryptically named TOGoS wondered how to get the address of a label in a different IMCC compilation unit. According to Dan there's no way to do that, and you didn't want to do that anyway.
After a few weeks of everyone else's proposals, Dan started to outline the design of Parrot's threading capabilities. He started by defining his terms (a useful thing to do in a field where there seem to me multiple competing definitions of various terms) and guaranteeing that user code wouldn't crash the interpreter (subject to the assumption that system level memory allocation was thread safe) before getting into the meat of his proposal. Which you're probably best reading for yourself; it's a long document but there's very little flab and any attempt of mine to summarize it would probably end up being at least as long as and a good deal less lucid than the original.
Of course, this sparked a ton of discussion, generally positive, as people asked for various clarifications and made suggestions. Gordon Henriksen pointed out a nasty race condition that means that the garbage collector can't be made as thread safe as Dan had hoped.
Summarizer Barbie says "Threads are hard!"
On Thursday, Dan posted a last call for comments and objections before he went on to the detailed design. This time there were some objections, but I don't think any of 'em are going to stop Dan.
groups.google.com[10.0.1.2]
groups.google.com[10.0.1.4]
Last week Dan had outlined an approach to organizing PMC vtables using a chaining approach; this week saw the discussion of that proposal with Benjamin K. Stuhl asking the hard questions.
Matt Fowles suggested that it might make sense to create a canonical suite of benchmarks to exercise Parrot well. His idea being that, if we have a standard suite of Parrot benchmarks, then potential performance affecting changes could be tested against that, rather than having custom benchmarks rolled each time. Luke Palmer pointed to examples/benchmarks and noted that it's hard to create benchmarks that test everything. However, he hoped that any good benchmark that gets posted to the list would get added to this suite, along with some documentation describing what is being tested.
Dan did some more designing, this time mandating that Parrot will,
eventually adopt ICU's formatting template for numeric templates but,
to start with, we'll be rolling our own. The new op will be format
Sx, [INP]y, [SP]z
.
groups.google.com[10.0.1.2]
Dan announced that he would be adding upcase
, downcase
,
titlecase
and to_chartype
to the various chartype vtables. He
also noted that he'd like to get some alternative chartypes and
encodings into Parrot as soon as possible to make sure we can actually
handle things without having to use Unicode all the time.
groups.google.com[10.0.1.2]
Will Coleda had some problems with IMCC's handling of the parrot calling conventions when he found that code that worked a couple of months ago had stopped working in the current Parrot (A month is a *very* long time in Parrot development though.) The problem took a fair bit of tracking down and I'm not entirely sure it's entirely fixed yet; Will had reached the point where the code would compile, but it still wouldn't actually run.
Steve Fink posted an obnoxious test case that generated memory corruption. The test case is obnoxious because it's 56KB of IMCC source code, and Steve had been unable to reduce it. This didn't discourage Leo Tötsch though, who set about tracking the bug to its lair. It's not fixed yet, but with the Patchmonster on the case it can only be a matter of time.
There were several other GC related issues that cropped up over the week; I wonder if they're all aspects of a single lurking bug.
Steve Fink also found a problem with IMCC failing to properly return integers from unprototyped routines and posted an appropriate patch to the test suite. It turns out that the problem is that IMCC doesn't quite implement the full Parrot Calling Conventions, especially the return convention, but it's getting there.
Leo Töposted a test program and some results for timing the difference between using shared and unshared PMCs. The shared versions are (not surprisingly) slower than the unshared ones; the question is whether the difference between the two can be improved. Hopefully the benchmark will get checked into examples/benchmarks as suggested by Luke earlier.
Dan noted that we have "a pile of different array classes with fuzzy requirements and specified behaviours which sort of inherit from each other except when they don't." He suggested that the time had come to work out what we actually want in the way of array classes, compare our requirements with what we have, and then to do something about making what was available match what was required. I'm not sure that the resulting discussion has finalized the set of array types needed, but it's getting there. (Does anyone else think 'FixedMixedArray' is awfully clumsy as names go?).
groups.google.com[10.0.1.2]
Robert Spier announced that repairing the web accessible TODO list was on his personal TODO list and asked to be nagged about it periodically.
Robert, remember you need to fix the web accessible TODO list.
Effortlessly Godwinning himself, Uri Guttman pointed to a press release which stated that Winston Churchill's parrot, Charlie, is now 104 years old and can still be coaxed into squawking "Fuck Hitler" and "Fuck the Nazis" which had apparently made it rather unsuitable for keeping at its owner's pet shop due to its habit of swearing at children.
Michael Scott continued his sterling work of updating and generally improving Parrot's documentation. This week his attention fell upon: the Perl scripts found in build_tools, classes and tools/dev. Top man that he is, he's currently working on the documentation embedded in C code.
groups.google.com[10.0.1.4]
Robert Spier (don't forget the web accessible todo list Robert) posted a list of the 177 currently outstanding Parrot issues in the RT system and asked for volunteers to go through them to help weed out those issues that were no longer current. So people did. Which is nice.
Michal Wallace is trying to make a dynamically loaded PMC that subclasses another dynamically loaded PMC and he can't work out how to do it. Leo Tötsch had the answer.
Nigel Sandever wondered how Parrot would handle eval
opcodes for
multiple different languages. Leo pointed him at the compile
op,
which (while it isn't fully implemented yet) will address this issue.
Dan noted that it's currently working for PIR and PASM code but that
it should be able to work eventually with any compiler that generates
standard bytecode.
Leo is working on turning OS level signals into Parrot level events, and he's not having an easy time of it. He posted a summary of the issues and asked for comments. Discussion continues.
Gordon Henriksen worries that Parrot's current architecture is
actively thread hostile. He also accepted that trying to change it now
wasn't really possible. So he outlined various ways in which the need
for locking could be reduced, which should help speed things up. The
big problem, as Gordon sees it, is that so many Parrot data structures
are mutable, and mutable data structures require locks. And having
PMCs that can morph from one type to another is... well, Gordon claims
that morph
must die, though he later modified this claim. He and
Leo batted this back and forth for a while; I'm not sure either side
is convinced.
Mattia Barbon noted that the embedding and extending interfaces were
still using different names for Parrot_Interp
and
Parrot_INTERP
. He wondered which was correct. It turns out that
nobody's quite sure, but the person who can make the decision -- Dan
-- was en route to Copenhagen when this came up, so there's no answer
yet.
Determined to test everyone's Unicode readiness, Luke Palmer kicked
off a discussion of the semantics of [1,2,3] \xBB+\xAB
[4,5,6]
. At first glance it looks like the result should be
[5,7,9]
, but Luke argued that actually, the code was trying to add
two lists, each containing a single scalar, that just happened to be
listrefs. Larry pointed out that "Doing what you expect at first
glance is also called 'not violating the principle of least
surprise'", before going on to surprise us all with 'lopsided' vector
ops, which would allow the programmer to specify when a value was
expected to be treated as a scalar:
$a \xBB+\xAB $b # Treat $a and $b as lists $x +\xAB $y # Treat $x as a scalar and $b as a list -\xAB @bar # Return a list of every element of @bar, negated @foo \xBB+ @bar # Add the length of @bar to every element of @foo
Then he scared me with @foo \xBB+= @foo
. He noted that it might take
some getting used to, but that it helped if you pronounce \xBB
and
\xAB
as 'each'. Austin Hastings didn't like it (from a syntax
highlighting point of view), but he appeared to be outvoted. Larry
pointed out that \xAB\xBB
etc were the least of a syntax highlighters
worries given that any use
, eval
or operator
declaration had the potential to morph any subsequent syntax. Piers
Cawley thought that truly accurate syntax highlighting would have
to be done in an image based IDE implemented in Perl because such an
editor would always know what rules were in scope for a given chunk of
code. A. Pagaltzis thought that this would definitely increment the
Smalltalkometer for Perl 6.
As discussion and exploration of this idea continued it became
apparent that people seem to like this particular weirding of the
language, and it certainly allows the programmer to disambiguate
things rather neatly. Luke even pointed out that this new approach
allows for calling a method on a list of values: @list \xBB.method
,
and to call a list of methods on a value: $value.\xAB @methods
.
Then the fun began. The issue is that \xBB
and \xAB
can also be
written <<
and >>
(but your POD processor hates you
for it). This leads to ambiguities like
>>+<<=<<
(which are even harder to type in
a Pod escape) which can be parsed as \xBB+<<=\xAB
or
\xBB+\xAB=\xAB
. Larry wondered if the problem arose because of trying to
make the <
and lt
>>
alternatives look too similar
to the Unicode glyphs.
You know, looking at that last paragraph I can see why people think Perl 6 is horribly scary. The thing is, you're not expected to use constructions like that in real world programs all the time; but when you're working out what a grammar should be you have to think of all the nasty edge cases to see where things break.
Anyway, such nastiness led to the possibility of introducing a
'whitespace eating' macro which would allow for the introduction of
disambiguating whitespace. The front runner for this macro is _
.
Remember a few months ago when there was some discussion of replacing the C style comma with some other glyph? If that were done, one of the consequences would be that
@foo = 1,2,3;
would fill @foo
with three elements instead of just the one as it
does in Perl 5. Joe Gottman had a few questions about the implications
of that, and wondered if Larry had actually ruled on it. Larry ruled
that list construction would continue to require brackets (or, if
you're American, parentheses) and went on to discuss some further
implications of that.
Thankfully, this section's normal service is resumed this week. The only catch is, I can't think of anything to say.
However, if you find these summaries useful or enjoyable, please consider contributing to the Perl Foundation to help support the development of Perl. You might also like to send me feedback at [email protected]'>p[email protected], or drop by my website (New! Improved! listening on port 80! Still sadly lacking in desperately new content!)
donate.perl-foundation.org -- The Perl Foundation
dev.perl.org -- Perl 6 Development site
www.bofh.org.uk -- My website, "Just a Summary"