Compiling Texy! with Phalanger

Texy! [1] is a converter, written in PHP, that turns a plain-text format (similar to the formats used in some wiki applications) into valid XHTML. The syntax is described on the Texy! web page [2]. Unfortunately, the page is available only in Czech, but the syntax is very straightforward, so you can understand it without learning Czech :-). In this article, I'll show how to compile Texy! using Phalanger and how to use it from another .NET application written in C#.

Phalanger has two compilation modes: legacy mode, which is compatible with any existing PHP code, and pure mode, which imposes some additional restrictions but gives much better results in terms of .NET interoperability. You can find more details in the article Phalanger for .NET developers on our website [3]. For compiling Texy! I chose pure mode, because it produces an assembly that is easy to use from C#. Texy! is also written in an elegant object-oriented manner, so it won't be difficult to modify it to satisfy the additional restrictions. So what do we have to do to make Texy! compatible with Phalanger pure mode? The two restrictions that matter here are that pure mode allows no global code and no inclusions, because all source files are compiled together.


You can also use the Phalanger Visual Studio Extensions if you don't want to compile from the command line. In that case, create a new Class Library Phalanger project and add all the Texy! source files to it. If you prefer the command line, use the following command (you'll need to list all the source files):

phpc /target:dll /pure /lang:CLR texy.php (other php sources...)

The target switch specifies that Phalanger should produce a .NET DLL library, pure selects the pure compilation mode, and lang:CLR enables the Phalanger language extensions that we will use (I'll talk about them later).
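Because every source file has to be listed, it can help to assemble the command line programmatically. The following is only a sketch in plain PHP: buildPhpcCommand is a hypothetical helper name, the flags are the ones shown above, and the function just builds the string without running the compiler.

```php
<?php
// Hypothetical helper: builds the phpc command line for a list of sources.
// It uses only the switches from this article and executes nothing.
function buildPhpcCommand(array $sources): string
{
    $parts = ['phpc', '/target:dll', '/pure', '/lang:CLR'];
    foreach ($sources as $file) {
        // Quote file names that contain spaces so they stay one argument.
        $parts[] = strpos($file, ' ') === false ? $file : '"' . $file . '"';
    }
    return implode(' ', $parts);
}

echo buildPhpcCommand(['texy.php', 'constants.php', 'html.php']), "\n";
// prints: phpc /target:dll /pure /lang:CLR texy.php constants.php html.php
```

The three file names are taken from this article; the real list has to contain every Texy! source file.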

Modifying Texy!

If you try to compile Texy! without any modifications, you'll get a number of compiler errors, all of which fall into two kinds. The first says "inclusions are not allowed in the pure unit" and the second says "global code is not allowed in the pure unit". Both are caused by the pure mode limitations described earlier, so we'll need to modify the source code a bit to meet these two rules. Fortunately, Texy! contains only a few lines of global code, so this won't be very difficult. Moreover, most of the global code consists of the following line:

if (!defined('TEXY')) die();

This code is there to ensure that a script is used correctly when it is included, so you can safely remove these lines. You can also remove all the includes from the texy.php file, because in pure mode all source files are compiled together. The includes in the source look like the following line:

require_once TEXY_DIR.'...some file....php';
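If you would rather not delete these lines by hand, the cleanup can be scripted. This is a minimal sketch in plain PHP, not part of Texy! or Phalanger: stripPureModeBlockers is a hypothetical helper, it assumes that each guard and each include sits on a line of its own, and the file name modules/demo.php in the sample input is made up.

```php
<?php
// Hypothetical helper: drops the two kinds of lines that pure mode rejects.
// Assumes every guard and include statement occupies a whole line.
function stripPureModeBlockers(string $source): string
{
    $kept = [];
    foreach (preg_split('/\R/', $source) as $line) {
        $trimmed = trim($line);
        // The guard: if (!defined('TEXY')) die();
        $isGuard = preg_match('/^if\s*\(\s*!defined\(\'TEXY\'\)\s*\)\s*die\(\);$/', $trimmed) === 1;
        // Includes of the form: require_once TEXY_DIR.'...';
        $isInclude = preg_match('/^require_once\s+TEXY_DIR\b.*;$/', $trimmed) === 1;
        if (!$isGuard && !$isInclude) {
            $kept[] = $line;
        }
    }
    return implode("\n", $kept);
}

$before = "<?php\nif (!defined('TEXY')) die();\nrequire_once TEXY_DIR.'modules/demo.php';\nclass Demo {}\n";
echo stripPureModeBlockers($before);
// prints the input without the guard and the include:
// <?php
// class Demo {}
```

The helper does not touch the constants or the remaining line of global code in html.php, so the result still needs a manual review.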

We have to make one last modification for the source code to compile correctly. Texy! uses several constants, most of which are defined in constants.php. Two more constants are in texy.php, and finally there is one more line of global code in the html.php file. To make Texy! compatible with pure mode, we will move all the constants and the initialization code into one global function:

function initTexy()
{
  // from 'html.php'
  TexyHtml::$valid = array_merge(TexyHtml::$block, TexyHtml::$inline);

  // from 'texy.php'
  define('TEXY', 'Version 1.2 for PHP5 $Revision: 45 $');
  /* there was also a TEXY_DIR constant, but you can ignore it,
  because it was used only for inclusions */

  // from 'constants.php'
  define('TEXY_CHAR', 'A-Za-z\x86-\xff');
  /* ... more constants ... */
}
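Moving the definitions into a function works because define() creates a global constant no matter where it is called, and static properties can be assigned from any function. The sketch below shows this in plain PHP; HtmlDemo, initDemo and DEMO_VERSION are hypothetical stand-ins, and the array contents are made up rather than taken from TexyHtml. The calls at the end are global code, so the sketch is meant for the ordinary PHP interpreter, not for pure mode.

```php
<?php
// Hypothetical stand-in for TexyHtml; the element lists are invented.
class HtmlDemo
{
    public static $block = ['div' => true, 'p' => true];
    public static $inline = ['em' => true, 'a' => true];
    public static $valid = [];
}

// Stand-in for initTexy(): all former global code lives in one function.
function initDemo()
{
    // Static properties can be filled in from inside a function.
    HtmlDemo::$valid = array_merge(HtmlDemo::$block, HtmlDemo::$inline);
    // define() registers a global constant even when called in a function.
    define('DEMO_VERSION', '1.0');
}

initDemo();
echo count(HtmlDemo::$valid), "\n"; // prints 4
echo DEMO_VERSION, "\n";            // prints 1.0
```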

Now we have a function that performs the initialization, and the source code compiles successfully, but we'll make two small tweaks so that it is easy to use from C#. First, we'll mark the Texy class (the one used by library users) with the Export attribute, which instructs the compiler to generate a class that can be used from languages other than PHP (this is part of the PHP/CLR extensions available in Phalanger):

[Export]
class Texy
{
  /* .. class source .. */
}

The second tweak is to call the initTexy function automatically from the Texy class constructor, so that it doesn't need to be called manually. Because we want to call it only once per application, we'll add a static field to the class that indicates whether the initialization has already been performed:

private static $initialized = false;

public function __construct()
{
  // static fields must be accessed through self::
  if (!self::$initialized) { initTexy(); self::$initialized = true; }
  /* ... rest of the constructor ... */
}
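The initialize-once guard can be tried in isolation with plain PHP. This is a minimal sketch under stated assumptions: initTexy here is a stand-in that only counts its calls, and TexyDemo is a hypothetical class used in place of the real Texy class. It contains global code, so it is meant for the ordinary PHP interpreter rather than for pure mode.

```php
<?php
// Stand-in for the real initTexy(): it only counts how often it runs.
$GLOBALS['initCalls'] = 0;

function initTexy()
{
    $GLOBALS['initCalls']++;
}

// Hypothetical class standing in for Texy.
class TexyDemo
{
    // Shared by all instances, so the guard holds for the whole application.
    private static $initialized = false;

    public function __construct()
    {
        // A plain $initialized would be an undefined local variable here,
        // so the static field has to be read and written through self::.
        if (!self::$initialized) {
            initTexy();
            self::$initialized = true;
        }
    }
}

new TexyDemo();
new TexyDemo();
echo $GLOBALS['initCalls'], "\n"; // prints 1
```

Creating two instances runs the initialization only once, which is the behavior the constructor above relies on.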

And that's all :-). If you compile the project now, you'll get an assembly that can easily be used from any other .NET language!

Using Texy! in C#

To demonstrate using Texy! from a C# application, I wrote a little ASP.NET demo. After creating a new web application, we need to copy the library produced by Phalanger to the bin directory, where ASP.NET locates it automatically. Now add the following controls to default.aspx:

<asp:TextBox Columns="60" Rows="10" runat="server" 
  ID="txtTexy" TextMode="MultiLine" /><br />
<asp:Button runat="server" ID="btnOk" OnClick="btnOk_Click" Text=" OK " />
<hr />
<asp:Literal runat="server" ID="ltrOutput" />

In the code-behind file, we need to add an event handler that is called when the user clicks the OK button:

protected void btnOk_Click(object sender, EventArgs e)
{
  // Create instance of Texy! parser
  Texy t = new Texy();
  // Call the 'process' method and cast result to string
  string parsed = (string)t.process(txtTexy.Text);
  // Display parsed text using literal
  ltrOutput.Text = parsed;
}

The demo application should be ready now! You can even browse all the methods of the Texy class in Visual Studio IntelliSense. The only overhead when calling PHP classes compiled in pure mode is that you have to cast the returned object to the correct type, due to the dynamic nature of the PHP language. In PHP you can't specify what type a method returns, so when Phalanger compiles a member function to .NET, it declares the return type and the parameter types as object.


The biggest limitation of this example is that you need the FullTrust permission to execute PHP code compiled with Phalanger, which means that you can't use this Texy! library on most shared web hosting services. We are, however, examining options for removing this limitation. The second problem is that there are a few issues with the regular expression functions in the latest Phalanger build. All of these bugs have already been fixed in the source code, but it will take some time before we release the next beta version (we'll do the best we can!). If you don't want to wait, you can download the Phalanger source code from CodePlex and build it yourself.

Links and references

Published: Monday, 12 February 2007, 12:45 AM
Author: Tomas Petricek
Tags: phalanger